Javier Sagastuy-Breña


Commonly Cited References
Deep Residual Learning for Image Recognition (2016). Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Referenced 3 times.
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate (2020). Zhiyuan Li, Kaifeng Lyu, Sanjeev Arora. Referenced 3 times.
Neural Tangent Kernel: Convergence and Generalization in Neural Networks (2018). Arthur Jacot, Franck Gabriel, Clément Hongler. Referenced 3 times.
Three Factors Influencing Minima in SGD (2017). Stanisław Jastrzębski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey. Referenced 3 times.
Wide neural networks of any depth evolve as linear models under gradient descent (2020). Jaehoon Lee, Lechao Xiao, Samuel S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Sohl-Dickstein, Jeffrey Pennington. Referenced 3 times.
Stochastic modified equations and adaptive stochastic gradient algorithms (2015). Qianxiao Li, Cheng Tai, Weinan E. Referenced 2 times.
On the Origin of Implicit Regularization in Stochastic Gradient Descent (2021). Samuel Smith, Benoît Dherin, David G. T. Barrett, Soham De. Referenced 2 times.
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks (2019). Umut Şimşekli, Levent Sagun, Mert Gürbüzbalaban. Referenced 2 times.
On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs) (2021). Zhiyuan Li, Sadhika Malladi, Sanjeev Arora. Referenced 2 times.
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour (2017). Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He. Referenced 2 times.
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks (2013). Andrew Saxe, James L. McClelland, Surya Ganguli. Referenced 2 times.
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond (2016). Levent Sagun, Léon Bottou, Yann LeCun. Referenced 2 times.
The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size (2018). Vardan Papyan. Referenced 2 times.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015). Sergey Ioffe, Christian Szegedy. Referenced 2 times.
Spherical Motion Dynamics of Deep Neural Networks with Batch Normalization and Weight Decay (2020). Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun. Referenced 2 times.
Analysis of Momentum Methods (2019). Nikola B. Kovachki, Andrew M. Stuart. Referenced 2 times.
Fluctuation-dissipation relations for stochastic gradient descent (2018). Sho Yaida. Referenced 2 times.
Don't Decay the Learning Rate, Increase the Batch Size (2017). Samuel Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le. Referenced 2 times.
Deep Neural Networks as Gaussian Processes (2017). Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein. Referenced 2 times.
One weird trick for parallelizing convolutional neural networks (2014). Alex Krizhevsky. Referenced 2 times.
Stochastic Gradient Descent Performs Variational Inference, Converges to Limit Cycles for Deep Networks (2018). Pratik Chaudhari, Stefano Soatto. Referenced 2 times.
The Variational Formulation of the Fokker-Planck Equation (1998). Richard Jordan, David Kinderlehrer, Felix Otto. Referenced 2 times.
Gradient Descent Happens in a Tiny Subspace (2018). Guy Gur-Ari, Daniel A. Roberts, Ethan Dyer. Referenced 2 times.
On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length (2018). Stanisław Jastrzębski, Zachary Kenton, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey. Referenced 2 times.
TF-Replicator: Distributed Machine Learning for Researchers (2019). Peter Buchlovsky, David Budden, Dominik Grewe, Chris S. Jones, John Aslanides, Frederic Besse, Andy Brock, Aidan Clark, Sergio Gómez Colmenarejo, Aedan Pope. Referenced once.
Loss Landscapes of Regularized Linear Autoencoders (2019). Daniel Kunin, Jonathan M. Bloom, Aleksandrina Goeva, Cotton Seed. Referenced once.
Width Provably Matters in Optimization for Deep Linear Neural Networks (2019). Simon S. Du, Wei Hu. Referenced once.
Uniform-in-Time Weak Error Analysis for Stochastic Gradient Descent Algorithms via Diffusion Approximation (2019). Yuanyuan Feng, Tingran Gao, Lei Li, Jian-Guo Liu, Yulong Lu. Referenced once.
Deep Learning without Weight Transport (2019). Mohamed Akrout, Collin Wilson, Peter C. Humphreys, Timothy Lillicrap, Douglas Tweed. Referenced once.
Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures (2018). Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap. Referenced once.
Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model (2019). Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger Grosse. Referenced once.
Stochastic Gradient Descent as Approximate Bayesian Inference (2017). Stephan Mandt, Matthew D. Hoffman, David M. Blei. Referenced once.
A mean field view of the landscape of two-layer neural networks (2018). Song Mei, Andrea Montanari, Phan-Minh Nguyen. Referenced once.
Learning in the machine: Random backpropagation and the deep learning channel (2018). Pierre Baldi, Peter Sadowski, Zhiqin Lu. Referenced once.
Deep Learning without Poor Local Minima (2016). Kenji Kawaguchi. Referenced once.
Deep Neural Networks as Gaussian Processes (2018). Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein. Referenced once.
Direct Feedback Alignment Provides Learning in Deep Neural Networks (2016). Arild Nøkland. Referenced once.
Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced (2018). Simon S. Du, Wei Hu, Jason D. Lee. Referenced once.
A variational analysis of stochastic gradient algorithms (2016). Stephan Mandt, Matthew D. Hoffman, David M. Blei. Referenced once.
Online Normalization for Training Neural Networks (2019). Vitaliy Chiley, Ilya Sharapov, Atli Kosson, Urs Köster, Ryan Reece, Sofia Samaniego de la Fuente, Vishal Subbiah, Michael James. Referenced once.
On Exact Computation with an Infinitely Wide Neural Net (2019). Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang. Referenced once.
A Simple Baseline for Bayesian Uncertainty in Deep Learning (2019). Wesley J. Maddox, Pavel Izmailov, Timur Garipov, Dmitry Vetrov, Andrew Gordon Wilson. Referenced once.
An Exponential Learning Rate Schedule for Deep Learning (2019). Zhiyuan Li, Sanjeev Arora. Referenced once.
The Implicit Regularization of Stochastic Gradient Flow for Least Squares (2020). Alnur Ali, Edgar Dobriban, Ryan J. Tibshirani. Referenced once.
Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions (2020). Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová. Referenced once.
Anomalous diffusion dynamics of learning in deep neural networks (2022). Guozhang Chen, Cheng Qu, Pulin Gong. Referenced once.
Implicit Gradient Regularization (2020). David G. T. Barrett, Benoît Dherin. Referenced once.
Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD (2020). Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun. Referenced once.
Stochastic modified equations for the asynchronous stochastic gradient descent (2019). Jing An, Jianfeng Lu, Lexing Ying. Referenced once.
Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification (2021). Francesca Mignacco, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová. Referenced once.