Li chun fang

Variational Channel Distribution Pruning and Mixed-Precision Quantization for Neural Network Model Compression

This paper presents a model compression framework that performs both pruning and quantization based on channel distribution information. We apply the variational inference technique …

WAN ting chang