
Research On Compression Method Of Deep Neural Network Model

Posted on: 2021-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: X J Zhu
Full Text: PDF
GTID: 2518306050965999
Subject: Computer Science and Technology

Abstract/Summary:
With the development of big data and computer hardware, it has become possible to train large deep neural networks (DNNs). Deep neural networks have therefore been studied extensively in recent years and applied in artificial intelligence fields such as face recognition, image classification, autonomous driving, and natural language processing. Although deep neural networks perform excellently in these fields, and have even surpassed humans on some tasks, most deep neural network models suffer from large storage consumption and a large amount of computation at inference time, which severely limits their deployment on low-performance devices such as mobile phones. It is therefore necessary to compress deep neural network models, reducing their storage consumption and computational cost so that the models can be transmitted easily and run on low-performance devices. Research on deep neural network model compression is thus of great significance for deploying models on such devices.

To address these shortcomings, this thesis applies model pruning, which greatly reduces model size while also achieving real acceleration on both CPU and GPU. The main research contents are as follows.

First, this thesis proposes a new convolution kernel structure, L2-Conv. Before forward propagation, L2-Conv performs a gradient-free separation of each convolution kernel into a directional kernel and a kernel radius. By updating the kernel radius, the gaps between the L2-norm values of the model's convolution kernels can be widened, which remedies the deficiency of ranking kernel importance directly by L2 norm: important and unimportant convolution kernels become easier to distinguish, which benefits convolution kernel pruning. The accuracy of models built with L2-Conv on the CIFAR-10 and CIFAR-100 datasets is not affected, and no extra computation is added when the model is deployed.

Second, this thesis proposes the L2-prune algorithm, built on the L2-Conv convolution structure. L2-prune is an iterative pruning method; it includes the pruning-rate setting algorithm proposed in this thesis, an algorithm for setting the pruning rate of each iteration, and methods for setting hyperparameters such as the learning rate. The pruning-rate setting algorithm automatically assigns a pruning rate to each layer of the model according to factors such as the layer's importance, computational cost, and storage consumption, greatly simplifying the pruning procedure. By setting different pruning rates for different layers, the model can reach a preset compression ratio or computation-reduction ratio while avoiding the pruning of important convolution kernels.

Finally, this thesis compares the L2-prune algorithm with SFP, FPGM, APoZ, and other methods. Under the same pruning-rate settings, L2-prune is more accurate than the compared methods on some models, and where it does not lead, it remains on the same level as the best algorithm. In the pruning experiments, L2-prune reduces the computation of the ResNet-56 model on the CIFAR-10 dataset by 60% without loss of accuracy, compresses the volume of the VGG16_bn model on CIFAR-10 by a factor of 10, and achieves a 2.34x speedup on GPU and a 3.62x speedup on CPU.
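The L2-Conv separation and the L2-norm-based filter ranking described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names and the flat-list representation of a kernel's weights are my own, and it assumes only that a kernel W is split into a radius r = ||W||_2 and a unit-norm directional kernel D with W = r * D, and that filters are ranked by radius for pruning.

```python
import math

def decompose_kernel(weights):
    """Split a flattened convolution kernel into (radius, direction):
    radius is the kernel's L2 norm, direction is the unit-norm kernel,
    so that weights == radius * direction elementwise."""
    radius = math.sqrt(sum(w * w for w in weights))
    direction = [w / radius for w in weights]
    return radius, direction

def recompose_kernel(radius, direction):
    """Rebuild the original kernel weights from radius and direction."""
    return [radius * d for d in direction]

def prune_by_radius(kernels, keep_ratio):
    """Rank kernels by radius (L2 norm) and keep only the largest
    fraction `keep_ratio`, preserving the original filter order --
    the basic L2-norm filter-pruning criterion."""
    radii = [decompose_kernel(k)[0] for k in kernels]
    n_keep = max(1, round(keep_ratio * len(kernels)))
    order = sorted(range(len(kernels)), key=lambda i: radii[i], reverse=True)
    kept = sorted(order[:n_keep])
    return [kernels[i] for i in kept]
```

For example, with three filters of radii 5.0, 0.5, and 1.0, `prune_by_radius([[3.0, 4.0], [0.3, 0.4], [1.0, 0.0]], 2/3)` keeps the first and third filters. The wider the gap between radii, the less ambiguous this ranking becomes, which is the motivation the abstract gives for L2-Conv.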
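The idea of assigning per-layer pruning rates to hit a preset computation-reduction target can also be illustrated. The abstract does not specify the thesis's actual rule, so the scheme below is a deliberately simple stand-in: less important layers are pruned harder, and the rates are scaled so the expected FLOP reduction matches the target. The importance scores, the inverse-importance weighting, and the `max_rate` cap are all assumptions for illustration.

```python
def layer_pruning_rates(layer_flops, layer_importance, target_reduction, max_rate=0.9):
    """Toy per-layer pruning-rate allocation.

    Rates are proportional to inverse importance and scaled so that
    sum(rate_i * flops_i) == target_reduction * sum(flops_i).
    Rates are capped at max_rate; a real implementation would
    redistribute any clipped share to the remaining layers."""
    total = sum(layer_flops)
    inv = [1.0 / imp for imp in layer_importance]  # prune unimportant layers harder
    scale = target_reduction * total / sum(w * f for w, f in zip(inv, layer_flops))
    return [min(max_rate, scale * w) for w in inv]
```

For instance, with layer FLOPs [100, 200, 100], importance scores [2.0, 1.0, 4.0], and a 30% reduction target, the middle (least important) layer receives the highest pruning rate, and the weighted average of the rates equals 0.3. The thesis's algorithm additionally accounts for storage consumption and avoids pruning important kernels; this sketch only conveys the target-driven, layer-wise allocation idea.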
Keywords/Search Tags:Deep Neural Network, Model pruning, Pruning rate, Convolution kernel