Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks
Effective training of deep neural networks suffers from two main issues. The first is that the parameter space of these models exhibits pathological curvature. Recent methods address this problem by using adaptive preconditioning for Stochastic Gradient Descent (SGD). These methods improve convergence by adapting to the local geometry of parameter …
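To make the idea of combining an adaptive preconditioner with Langevin dynamics concrete, here is a minimal sketch of one preconditioned SGLD update. It assumes a diagonal, RMSprop-style preconditioner built from a running average of squared gradients; the function name, hyperparameters, and the omission of the correction term arising from the parameter dependence of the preconditioner are simplifications for illustration, not the paper's exact algorithm.

```python
import numpy as np

def preconditioned_sgld_step(theta, stoch_grad, v, lr=1e-3, alpha=0.99, eps=1e-5):
    """One sketched update of SGLD with a diagonal, RMSprop-style preconditioner.

    theta      : current parameter vector
    stoch_grad : minibatch gradient of the negative log-posterior (the loss)
    v          : running average of squared gradients (preconditioner state)
    """
    # Update the running second-moment estimate of the gradient.
    v = alpha * v + (1.0 - alpha) * stoch_grad ** 2
    # Diagonal preconditioner adapts the step size to the local scale of each parameter.
    G = 1.0 / (np.sqrt(v) + eps)
    # Langevin update: preconditioned gradient step plus Gaussian noise
    # whose variance is scaled by the same preconditioner.
    noise = np.sqrt(lr * G) * np.random.randn(*theta.shape)
    theta = theta - 0.5 * lr * G * stoch_grad + noise
    return theta, v
```

Iterating this step over minibatches yields (approximate) posterior samples of the network weights rather than a single point estimate, while the preconditioner mitigates the pathological curvature the abstract describes.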