An analytic theory of shallow networks dynamics for hinge loss classification

Type: Preprint

Publication Date: 2020-01-01

Citations: 0

Abstract

Neural networks have been shown to perform incredibly well in classification tasks over structured high-dimensional datasets. However, the learning dynamics of such networks are still poorly understood. In this paper we study in detail the training dynamics of a simple type of neural network: a single hidden layer trained to perform a classification task. We show that in a suitable mean-field limit this case maps to a single-node learning problem with a time-dependent dataset determined self-consistently from the average node population. We specialize our theory to the prototypical case of a linearly separable dataset and a linear hinge loss, for which the dynamics can be explicitly solved. This allows us to address in a simple setting several phenomena appearing in modern networks, such as slowing down of training dynamics, crossover between rich and lazy learning, and overfitting. Finally, we assess the limitations of mean-field theory by studying the case of a large but finite number of nodes and of training samples.
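The setting described in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's exact setup: a single-hidden-layer ReLU network with mean-field 1/N output scaling, trained by full-batch gradient descent on a linear hinge loss over a linearly separable dataset. All sizes, the teacher construction, and the learning rate are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, P = 2, 200, 100            # input dim, hidden nodes, training samples

teacher = rng.normal(size=d)     # direction defining the separable labels
X = rng.normal(size=(P, d))
y = np.sign(X @ teacher)         # linearly separable dataset

W = rng.normal(size=(N, d))      # first-layer weights w_i
a = rng.normal(size=N)           # output weights a_i

def forward(X):
    h = np.maximum(X @ W.T, 0.0)     # ReLU activations, shape (P, N)
    return h @ a / N                 # mean-field 1/N scaling of the output

def hinge_loss(X, y):
    return np.mean(np.maximum(0.0, 1.0 - y * forward(X)))

loss0 = hinge_loss(X, y)
lr = 0.5 * N                     # mean-field dynamics: step size scales with N
for step in range(2000):
    f = forward(X)
    active = y * f < 1.0         # samples still inside the hinge
    if not active.any():         # linear hinge: zero loss, dynamics stop exactly
        break
    g = -(y * active) / P        # dL/df for each sample
    h = np.maximum(X @ W.T, 0.0)
    mask = X @ W.T > 0.0         # ReLU derivative
    grad_a = (h.T @ g) / N
    grad_W = ((g[:, None] * a[None, :] * mask).T @ X) / N
    a -= lr * grad_a
    W -= lr * grad_W

print("initial hinge loss:", loss0)
print("final hinge loss:", hinge_loss(X, y))
```

Because the linear hinge loss has zero gradient on samples outside the margin, the dynamics freeze once every sample satisfies y f(x) >= 1, which is one of the features that makes this case analytically tractable in the paper.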

Locations

  • arXiv (Cornell University)
  • HAL (Le Centre pour la Communication Scientifique Directe)

Similar Works

  • An analytic theory of shallow networks dynamics for hinge loss classification* (2021): Franco Pellegrini, Giulio Biroli
  • On the Learning Dynamics of Deep Neural Networks (2018): Rémi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio
  • Neural networks: from the perceptron to deep nets (2023): Marylou Gabrié, Surya Ganguli, Carlo Lucibello, Riccardo Zecchina
  • Limitations of Lazy Training of Two-layers Neural Networks (2019): Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
  • Gradient Starvation: A Learning Proclivity in Neural Networks (2020): Mohammad Pezeshki, Sékou-Oumar Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie
  • Disentangling feature and lazy training in deep neural networks (2020): Mario Geiger, Stefano Spigler, Arthur Jacot, Matthieu Wyart
  • Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks (2019): Phan-Minh Nguyen
  • Can Shallow Neural Networks Beat the Curse of Dimensionality? A Mean Field Training Perspective (2020): Stephan Wojtowytsch, Weinan E
  • Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss (2020): Lénaïc Chizat, Francis Bach
  • Understanding the Loss Surface of Neural Networks for Binary Classification (2018): Shiyu Liang, Ruoyu Sun, Yixuan Li, R. Srikant
  • The loss landscape of deep linear neural networks: a second-order analysis (2022): El Mehdi Achour, François Malgouyres, Sébastien Gerchinovitz
  • The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks (2023): Mor Shpigel Nacson, Rotem Mulayoff, Greg Ongie, Tomer Michaeli, Daniel Soudry
  • Limiting fluctuation and trajectorial stability of multilayer neural networks with mean field training (2021): Huy Tuan Pham, Phan-Minh Nguyen
  • Tilting the playing field: Dynamical loss functions for machine learning (2021): Miguel Ruiz-Garcia, Ge Zhang, Samuel S. Schoenholz, Andrea J. Liu
  • Theory III: Dynamics and Generalization in Deep Networks (2019): Andrzej Banburski, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack D. Hidary, Tomaso Poggio
  • Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions (2020): Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová

Works That Cite This (0)
