On the Generalization Benefit of Noise in Stochastic Gradient Descent

Type: Preprint

Publication Date: 2020-06-26

Citations: 17

DOI: https://doi.org/10.48550/arXiv.2006.15081

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Strength of Minibatch Noise in SGD (2021). Ziyin Liu, Kangqiao Liu, T. Mori, Masahito Ueda.
  • On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (2016). Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tang.
  • Understanding Generalization and Stochastic Gradient Descent (2017). Samuel Smith, Quoc V. Le.
  • The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent (2020). X. Qian, Diego Klabjan.
  • Three Factors Influencing Minima in SGD (2017). Stanisław Jastrzębski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey.
  • Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent (2024). Hideyuki Umeda, Hideaki Iiduka.
  • Stochastic Gradient Descent: Going As Fast As Possible But Not Faster (2017). Alice Schoenauer Sebag, Marc Schoenauer, Michèle Sébag.
  • On the Noisy Gradient Descent that Generalizes as SGD (2019). Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu.
  • On the different regimes of Stochastic Gradient Descent (2023). Antonio Sclocchi, Matthieu Wyart.
  • Special Properties of Gradient Descent with Large Learning Rates (2022). Amirkeivan Mohtashami, Martin Jaggi, Sebastian U. Stich.
  • Interplay Between Optimization and Generalization of Stochastic Gradient Descent with Covariance Noise (2019). Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba.
  • Stochastic Gradient Descent with Large Learning Rate (2020). Kangqiao Liu, Ziyin Liu, Masahito Ueda.
  • The Marginal Value of Momentum for Small Learning Rate SGD (2023). Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li.
  • Towards Theoretical Understanding of Large Batch Training in Stochastic Gradient Descent (2018). Xiaowu Dai, Yuhua Zhu.
  • Towards Better Generalization: BP-SVRG in Training Deep Neural Networks (2019). Hao Jin, Dachao Lin, Zhihua Zhang.
  • Increasing Batch Size Improves Convergence of Stochastic Gradient Descent with Momentum (2025). Ken-ichi Kamo, Hideaki Iiduka.
  • On the Overlooked Structure of Stochastic Gradients (2022). Zeke Xie, Qian-Yuan Tang, Zheng He, Mingming Sun, Ping Li.
  • An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise (2019). Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba.
  • A Bayesian Perspective on Generalization and Stochastic Gradient Descent (2017). Samuel Smith, Quoc V. Le.