SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks
Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy in constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited-entry codebook. For very low precisions, such as binary …
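To make the codebook idea concrete, the sketch below quantizes a weight tensor by snapping each value to its nearest entry in a small symmetric codebook. This is a minimal illustration only, not the paper's training procedure: the binary codebook {-a, +a} and the choice of the scaling factor a as the mean absolute weight are assumptions for this example, whereas SYQ learns its symmetric quantization during training.

```python
import numpy as np

def codebook_quantize(w, codebook):
    """Map each weight to the nearest entry of a limited-entry codebook.

    Illustrative sketch: a forward-pass quantizer only, with no
    learned parameters or gradient handling.
    """
    w = np.asarray(w)
    cb = np.asarray(codebook)
    # For every weight, find the index of the closest codebook entry.
    idx = np.abs(w[..., None] - cb).argmin(axis=-1)
    return cb[idx]

# Example: symmetric binary codebook {-a, +a}. Choosing a as the mean
# absolute weight is a common heuristic, assumed here for illustration.
w = np.random.randn(4, 4).astype(np.float32)
a = np.abs(w).mean()
w_q = codebook_quantize(w, [-a, +a])
```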