
Research On Compression And Acceleration Of Deep Neural Network Based On Model Pruning

Posted on: 2021-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: J H Xu
Full Text: PDF
GTID: 2518306476450274
Subject: Signal and Information Processing
Abstract/Summary:
Although deep neural networks have achieved great success in recent years, their high performance comes with high computation and storage costs. The need to deploy deep neural networks on mobile devices therefore motivates research on network acceleration and compression. This thesis focuses on compression and acceleration based on model pruning, aiming to reduce the number of network parameters effectively while maintaining performance. On the one hand, the absolute value of a weight and its change during iterative pruning are used as the basis for judging its importance; on the other hand, the pruning framework is combined with knowledge distillation to achieve a better compression effect. The contributions of this thesis are as follows:

Firstly, we introduce the basic components of convolutional neural networks, including the convolution operation, pooling operation, and activation function, and summarize common regularization methods. We also introduce the motivation behind and basic implementation of the main model compression and acceleration methods, including weight quantization, low-rank decomposition, model pruning, and knowledge distillation. A pruning algorithm based on the absolute value of weights is simulated from both the structured and unstructured perspectives, demonstrating the performance of magnitude-based pruning.

Secondly, building on the weight-magnitude importance criterion, we propose an iterative pruning method that also accounts for weight variation (a sketch of such a criterion follows this abstract). Adding the variation term alleviates over-reliance on the absolute value and protects weights that are currently small but changing rapidly. Compared with magnitude-only pruning on CIFAR-10, VGG-16 and ResNet-56 improve test accuracy by 0.18% and 0.37% at unstructured pruning ratios of 95% and 90%, respectively, and by 0.69% and 0.51% at structured pruning ratios of 80% and 60%. In addition, fixed-proportion pruning is replaced with progressive iterative pruning: more parameters are removed in the early stages, and the pruning strength is reduced in later stages so as not to damage the network structure, which effectively improves the accuracy of the sparse model.

Thirdly, since the essence of pruning is to obtain an efficient network structure rather than to inherit the corresponding weights, the fine-tuning step in the pruning framework is replaced by training from reinitialized weights, and the weight-variation pruning method is verified on the CIFAR-10 dataset. The experimental results show that, compared with the fine-tuning scheme, reinitialized training improves the accuracy of VGG-16 by 0.02% at a 95% pruning ratio, while ResNet-56 loses 0.13%. In the structured pruning experiments, VGG-16 and ResNet-56 improve test accuracy by 0.46% and 0.13% at pruning ratios of 90% and 80%, respectively.

Finally, under the reinitialized-training pruning framework, training is combined with knowledge distillation: the original network serves as the teacher and the sparse network as the student, and training the sparse network on the teacher's soft targets compensates for the accuracy loss caused by pruning (a sketch of this soft-target loss also follows). Validation on CIFAR-10 shows that the accuracy of VGG-16 and ResNet-56 improves by 0.27% and 0.83%, respectively, at a 95% pruning ratio; the gains from structured pruning are more pronounced, with VGG-16 and ResNet-56 gaining 1.4% and 1.31% at pruning ratios of 90% and 80%, respectively.
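The abstract describes the proposed importance criterion only at a high level: a weight's score combines its absolute value with how much the weight changed during iterative pruning. The following PyTorch sketch shows one way such a criterion could be written; the additive combination, the mixing coefficient alpha, and the function names are illustrative assumptions, not the thesis's actual formulation.

import torch

def importance_scores(weight, prev_weight, alpha=0.5):
    # Combine current magnitude with the change since the previous pruning
    # iteration, so weights that are small but moving fast are protected.
    # The additive form and alpha are assumed; the abstract gives no formula.
    return weight.abs() + alpha * (weight - prev_weight).abs()

def magnitude_variation_prune(weight, prev_weight, ratio, alpha=0.5):
    # Unstructured pruning: build a 0/1 mask that zeroes out the `ratio`
    # fraction of weights with the lowest importance scores.
    scores = importance_scores(weight, prev_weight, alpha)
    k = int(ratio * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = scores.flatten().kthvalue(k).values
    return (scores > threshold).float()

# Example: prune 90% of a random 64x64 weight tensor.
w_now = torch.randn(64, 64)
w_prev = w_now + 0.01 * torch.randn(64, 64)
mask = magnitude_variation_prune(w_now, w_prev, ratio=0.9)
w_sparse = w_now * mask  # the mask stays fixed while the network retrains

For structured pruning, the same idea would be applied per filter (for example, by summing scores over each filter) rather than per individual weight.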
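The progressive schedule is likewise described only qualitatively: prune heavily at first, then reduce the pruning strength in later rounds. One common schedule with exactly that shape is the cubic gradual-pruning schedule sketched below; whether the thesis uses this particular form is not stated, so it too is an illustrative assumption.

def sparsity_at_round(t, total_rounds, final_ratio):
    # Cubic schedule: sparsity rises steeply in early rounds and flattens
    # toward final_ratio, so later rounds prune with reduced strength.
    # Assumed form; the abstract does not specify the actual schedule.
    frac = min(t / total_rounds, 1.0)
    return final_ratio * (1.0 - (1.0 - frac) ** 3)

# Example: 10 rounds toward 95% sparsity. Round 1 already reaches ~25.7%
# sparsity, while the final round adds under 0.1%.
targets = [sparsity_at_round(t, 10, 0.95) for t in range(1, 11)]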
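Finally, the distillation step trains the sparse student network on soft targets from the original teacher network. Below is a minimal sketch of the standard soft-target loss of Hinton et al. (2015); the temperature T, the mixing weight lam, and the inclusion of a hard-label term are assumed hyperparameter choices, since the abstract specifies none of them.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, lam=0.9):
    # Soft-target term: KL divergence between temperature-softened teacher
    # and student distributions, scaled by T^2 so its gradients are on the
    # same scale as the hard-label term (Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return lam * soft + (1.0 - lam) * hard

# Example with a dummy batch of 8 samples and 10 CIFAR-10 classes.
s = torch.randn(8, 10)            # student (sparse network) logits
t = torch.randn(8, 10)            # teacher (original network) logits
y = torch.randint(0, 10, (8,))    # ground-truth labels
loss = distillation_loss(s, t, y)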
Keywords/Search Tags: Deep neural network, Model compression and acceleration, Model pruning, Iterative pruning, Knowledge distillation