A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks

The gradient noise (GN) in the stochastic gradient descent (SGD) algorithm is often assumed to be Gaussian in the large-data regime, on the grounds that the classical central limit theorem (CLT) applies. This assumption is typically made for mathematical convenience, since it enables SGD to be analyzed as a …