A unified law of robustness for Bregman divergence losses
In contemporary deep learning practice, models are often trained to near-zero loss, i.e., to nearly interpolate the training data. However, the number of parameters in such models is usually far greater than the number of data points $n$, which is the theoretical minimum needed for interpolation: a phenomenon referred to as …