ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization
Stochastic gradient descent (SGD) is widely used for its simplicity and outstanding generalization ability. Adaptive gradient methods have been proposed to further accelerate the optimization process. In this paper, we revisit existing adaptive gradient optimization methods with a new interpretation. This new perspective leads to a refreshed understanding …