Flattening Sharpness for Dynamic Gradient Projection Memory Benefits
Continual Learning
Neural networks trained with backpropagation are notably susceptible to catastrophic forgetting, where networks tend to forget previously learned skills upon learning new ones. To address this 'sensitivity-stability' dilemma, most previous efforts have been devoted to minimizing the empirical risk with different parameter regularization terms and episodic memory, but rarely exploring the …