Predicting the outputs of finite deep neural networks trained with noisy gradients
Predicting the outputs of finite deep neural networks trained with noisy gradients
A recent line of works studied wide deep neural networks (DNNs) by approximating them as Gaussian processes (GPs). A DNN trained with gradient flow was shown to map to a GP governed by the neural tangent kernel (NTK), whereas earlier works showed that a DNN with an i.i.d. prior over …