With deep learning dominating Computer Vision, Natural Language Processing, and other fields in recent years, researchers have proposed an ever-growing number of classic neural network architectures. Many CNN architectures, such as VGG, GoogLeNet, ResNet, and DenseNet, have been proposed for image classification. At the same time, the overall complexity of these models keeps increasing. Therefore, despite the huge performance gains brought by deep learning, computational bottlenecks make such models difficult to deploy on terminal devices and in low-latency scenarios. To this end, researchers have proposed a variety of model compression and acceleration methods that reduce model complexity and directly cut storage and computational overhead.

Firstly, this paper introduces a convolutional neural network pruning method, channel-level pruning based on the scaling factors γ of Batch Normalization layers, together with our improvements. For channel selection, we propose local absolute min-max normalization to address the "Chasm Issue" that may arise from absolute global comparison, which introduces intra-layer local information into channel selection; to account for the effect of ignoring the shift factors β, we propose ablation-based β transfer; and to further improve fine-tuning of the slimmed/pruned model, we propose knowledge distillation based fine-tuning. The experimental results show that our improved method achieves clearly better compression and acceleration than the original method, especially on convolutional neural networks with residual modules.

Then, to address the accuracy loss that the sparsity constraint caused in the previous experiments, we propose a sigmoid-shrinkage based adaptive dynamic sparsity constraint (Deep AdaLASSO). We also further improve the local min-max normalization proposed above, using a linear combination to smoothly introduce intra-layer local information. The experimental results show that these improvements on top of the previous pruning method effectively reduce the accuracy loss caused by compression and acceleration.

Finally, to address the degradation of knowledge distillation based fine-tuning when the pruning ratio is too high, we propose an improved fine-tuning method combined with two-step knowledge distillation. The experimental results show that this improvement in fine-tuning alleviates the degradation.
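To make the channel-selection step above concrete, the following PyTorch sketch scores BN channels by |γ|, min-max normalizes the scores within each layer (the local information mentioned above), and blends them with the raw global |γ| through a linear combination. It is a minimal illustration under stated assumptions, not the paper's exact implementation: the blending weight alpha, the function names, and the global-threshold rule are hypothetical choices made for the example.

```python
import torch
import torch.nn as nn

def channel_scores(model: nn.Module, alpha: float = 0.5) -> dict:
    """Score each BN channel for pruning.

    Sketch only: |gamma| is min-max normalized inside each BN layer
    (local information) and blended with the raw global |gamma| via a
    linear combination weighted by `alpha` (an assumed hyperparameter).
    """
    scores = {}
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            g = m.weight.detach().abs()                          # |gamma| per channel
            local = (g - g.min()) / (g.max() - g.min() + 1e-12)  # layer-wise min-max normalization
            scores[name] = alpha * local + (1.0 - alpha) * g     # blend local and global views
    return scores

def global_threshold(scores: dict, prune_ratio: float = 0.5) -> torch.Tensor:
    """Pick one threshold so roughly `prune_ratio` of all channels fall below it."""
    all_scores = torch.cat(list(scores.values()))
    k = min(int(all_scores.numel() * prune_ratio), all_scores.numel() - 1)
    return torch.sort(all_scores).values[k]
```

In use, channels whose score falls below the threshold would be removed, a slimmer network rebuilt from the surviving channels, and the pruned model then fine-tuned (in this work, with knowledge distillation based fine-tuning). Blending the layer-normalized score with the raw |γ| is one way to avoid the situation where a single layer with uniformly small γ values is pruned away wholesale under a purely global comparison.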