Variational Channel Distribution Pruning and Mixed-Precision Quantization for Neural Network Model Compression
This paper presents a model compression framework that performs both pruning and quantization based on channel distribution information. We apply the variational inference technique …
WAN ting chang