AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates
Structured weight pruning is a representative model compression technique for DNNs that reduces storage and computation requirements and accelerates inference. Because structured pruning involves a large number of flexible hyperparameters, an automatic process for determining them is necessary. This work proposes AutoCompress, an automatic structured pruning framework with the following key performance …