Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks
Deep networks are typically trained with many more parameters than the size of the training dataset. Recent empirical evidence indicates that the practice of overparameterization not only benefits training large models, but also assists – perhaps counterintuitively – in building lightweight models. Specifically, it suggests that overparameterization benefits model pruning / …