Towards Quantifying the Preconditioning Effect of Adam

There is a notable dearth of results characterizing the preconditioning effect of Adam and showing how it may alleviate the curse of ill-conditioning -- an issue plaguing gradient descent (GD). In this work, we perform a detailed analysis of Adam's preconditioning effect for quadratic functions and quantify to what extent …
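
As a rough illustration of the ill-conditioning issue for GD mentioned in the abstract (this sketch is not taken from the paper), the snippet below counts how many GD steps a two-dimensional diagonal quadratic needs to reach a fixed tolerance as its condition number grows. The function name `gd_iterations`, the choice of quadratic, the step size 1/L, the starting point, and the tolerance are all assumptions made for this example. It covers only the GD side and makes no claim about Adam's behavior.

```python
import math


def gd_iterations(kappa, tol=1e-6, max_iters=10_000_000):
    """Count GD steps on f(x) = 0.5 * (L * x1^2 + mu * x2^2).

    Hypothetical setup for illustration: mu = 1 and L = kappa, so the
    Hessian diag(L, mu) has condition number kappa. GD uses the
    standard step size 1 / L and starts from the point (1, 1).
    """
    L, mu = float(kappa), 1.0
    step = 1.0 / L
    x1, x2 = 1.0, 1.0
    iters = 0
    # Stop once the iterate is within `tol` of the minimizer at the origin.
    while math.hypot(x1, x2) > tol and iters < max_iters:
        g1, g2 = L * x1, mu * x2  # gradient of the quadratic
        x1 -= step * g1
        x2 -= step * g2
        iters += 1
    return iters


if __name__ == "__main__":
    for kappa in (1, 10, 100, 1000):
        print(f"kappa={kappa:5d}  GD steps to tol: {gd_iterations(kappa)}")
```

In this setup the slow coordinate contracts by a factor of (1 - 1/kappa) per step, so the step count grows roughly in proportion to the condition number.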