
Model Compression Techniques for Deep Neural Networks

Posted on: 2020-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhong
Full Text: PDF
GTID: 2428330626464651
Subject: Software engineering
Abstract/Summary:
With the availability of large-scale datasets and the growth of computing power, Convolutional Neural Networks (CNNs) have achieved great success in face recognition, object detection, tracking, and image segmentation. To improve performance, academia has designed network structures that are ever deeper and wider, with larger capacity and higher computational complexity. Such large-scale models run smoothly on high-performance computers, but they are hard to deploy on mobile platforms constrained by computing resources and power consumption, such as smartphones and embedded devices. Against this background, compressing CNN models has attracted considerable attention, and pruning CNN filters in particular has become a popular research direction due to its high compression rate and acceleration ability. With pruning, large models can be compressed into lightweight models of comparable performance and transplanted to mobile devices, so that academic results can realize greater value in industry. This thesis proposes pruning-based model compression algorithms from two different perspectives.

This thesis argues and demonstrates that where to prune is a critical issue in the pruning task, and proposes a filter pruning method based on learning the pruning position. Considering the hierarchical structure of CNNs, a long short-term memory (LSTM) network is employed as an evaluation model to find the least important layer and generate the pruning decision. First, the neural network is encoded as a string representation and fed into the LSTM, which decides whether or not to prune each layer. Then a channel-based criterion is used to evaluate the importance of each filter in the chosen layer, and some unimportant filters are pruned, combined with a recovery mechanism that restores part of the accuracy lost to pruning. The LSTM is updated with the policy gradient method, using both performance and complexity as the reward.

To address the performance degradation caused by pruning, an adaptive filter pruning method based on the attention mechanism, Squeeze-Excitation-Pruning (SEP), is proposed. The SEP module operates on the feature-channel dimension and is used to reconstruct the baseline model. The SEP operation is performed in the preceding convolutional layer: it generates an importance weight vector that scales the feature maps of the next convolutional layer, and it sets some low-importance weights to zero. SEP is a data-dependent adaptive filter pruning method: for different input images, different filters are soft-pruned according to the SEP selection, i.e., the corresponding convolution operations are skipped.

Detailed and comprehensive experiments are carried out. To verify the generality of the algorithms, three network structures (VGG19, ResNet56, and a fully connected model) are evaluated on three benchmark datasets (CIFAR-10, CIFAR-100, and MNIST). The experimental results are compared and analyzed against existing algorithms. The results show that the proposed pruning methods can substantially compress a variety of network structures while retaining comparable accuracy, outperforming other state-of-the-art methods.
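The channel-based importance evaluation for filters in a chosen layer can be sketched as follows. This is a minimal NumPy illustration, not the thesis's actual implementation: the L1-norm importance criterion, the pruning ratio, and the function names are assumptions for the sake of the example.

```python
import numpy as np

def filter_importance(weights):
    # weights: (out_channels, in_channels, kH, kW)
    # Score each filter by its L1 norm (an assumed importance criterion).
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def prune_filters(weights, ratio):
    # Drop the lowest-scoring filters; keep the rest in their original order.
    scores = filter_importance(weights)
    n_prune = int(len(scores) * ratio)
    keep = np.sort(np.argsort(scores)[n_prune:])
    return weights[keep], keep

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))       # a toy conv layer with 8 filters
pruned_w, kept_idx = prune_filters(w, ratio=0.25)
```

In a real pipeline, the recovery mechanism described above would follow this step: the pruned model is fine-tuned for a few epochs to regain the accuracy lost by removing filters.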
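The policy-gradient update used to train the LSTM controller can be illustrated with a bare REINFORCE step on a single categorical decision. This toy sketch omits the LSTM and the performance/complexity reward entirely; the learning rate, reward value, and update loop are all assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reinforce_step(logits, action, reward, lr=0.1):
    # REINFORCE for one categorical decision:
    # grad of log pi(action) w.r.t. logits = one_hot(action) - softmax(logits)
    probs = softmax(logits)
    grad = -probs
    grad[action] += 1.0
    return logits + lr * reward * grad

# If action 0 consistently earns positive reward, its probability rises.
logits = np.zeros(3)
for _ in range(100):
    logits = reinforce_step(logits, action=0, reward=1.0)
```

In the method described above, the reward would instead combine the pruned model's accuracy and its computational cost, so the controller learns to prune layers that hurt performance least.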
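The SEP forward pass (squeeze, excitation, then soft pruning of low-weight channels) can be sketched in NumPy. This is a hypothetical single-sample illustration: the bottleneck reduction ratio, the ReLU/sigmoid choices, and the per-input pruning ratio are assumptions, not the thesis's exact design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sep_forward(feature_map, w1, w2, prune_ratio):
    # feature_map: (C, H, W) activations for one input sample
    c = feature_map.shape[0]
    # Squeeze: global average pooling over spatial dimensions -> (C,)
    z = feature_map.mean(axis=(1, 2))
    # Excitation: bottleneck MLP with a sigmoid gate -> per-channel weights
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))
    # Pruning: zero the lowest-weight channels for THIS input (soft pruning)
    mask = np.ones(c)
    mask[np.argsort(s)[:int(c * prune_ratio)]] = 0.0
    return feature_map * (s * mask)[:, None, None], mask

rng = np.random.default_rng(1)
x = rng.normal(size=(16, 8, 8))        # 16 channels
w1 = rng.normal(size=(4, 16))          # reduction ratio 4 (assumed)
w2 = rng.normal(size=(16, 4))
out, mask = sep_forward(x, w1, w2, prune_ratio=0.25)
```

Because the mask depends on the input through the excitation weights, different images suppress different channels, which matches the data-dependent behavior described above.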
Keywords/Search Tags: Model Compression, Pruning, Reinforcement Learning, Attention Mechanism