A Bayesian Perspective on Generalization and Stochastic Gradient Descent

Type: Preprint

Publication Date: 2017-10-17

Citations: 10

Locations

  • arXiv (Cornell University)

Similar Works

  • A Bayesian Perspective on Generalization and Stochastic Gradient Descent (2017) - Samuel Smith, Quoc V. Le
  • Understanding Generalization and Stochastic Gradient Descent (2017) - Samuel Smith, Quoc V. Le
  • Train longer, generalize better: closing the generalization gap in large batch training of neural networks (2017) - Elad Hoffer, Itay Hubara, Daniel Soudry
  • On the Generalization Benefit of Noise in Stochastic Gradient Descent (2020) - Samuel Smith, Erich Elsen, Soham De
  • A Bayesian Perspective on Training Speed and Model Selection (2020) - Clare Lyle, Lisa Schut, Binxin Ru, Yarin Gal, Mark van der Wilk
  • On the different regimes of Stochastic Gradient Descent (2023) - Antonio Sclocchi, Matthieu Wyart
  • On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (2016) - Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tang
  • Interplay Between Optimization and Generalization of Stochastic Gradient Descent with Covariance Noise (2019) - Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
  • Three Factors Influencing Minima in SGD (2017) - Stanisław Jastrzębski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey
  • Understanding deep learning requires rethinking generalization (2016) - Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
  • The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning (2023) - Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu
  • Deep learning: a statistical viewpoint (2021) - Peter L. Bartlett, Andrea Montanari, Alexander Rakhlin
  • How Two-Layer Neural Networks Learn, One (Giant) Step at a Time (2023) - Yatin Dandi, Florent Krząkała, Bruno Loureiro, Luca Pesce, Ludovic Stephan
  • An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise (2019) - Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
  • Stochastic Training is Not Necessary for Generalization (2021) - Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
  • The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent (2020) - X. Qian, Diego Klabjan

Works Cited by This (0)
