
Neural Network Model Compression Methods Based on Parameter and Feature Redundancy

Posted on: 2019-07-05
Degree: Master
Type: Thesis
Country: China
Candidate: W H Yang
Full Text: PDF
GTID: 2428330572456399
Subject: Engineering
Abstract/Summary:
In recent years, the Convolutional Neural Network (CNN) has been widely used because of its great achievements in many fields, such as image recognition, speech recognition, and natural language processing. While CNN models continuously approach the accuracy limits of computer vision tasks, their depth and size have also multiplied. Problems such as large model size, high demands on hardware resources, high storage cost, and alarming power consumption are the main obstacles to deployment on mobile and embedded devices. Under such circumstances, it is very important to compress CNN models.

In the early days, researchers proposed a series of model compression methods, such as weight quantization and low-rank decomposition, but their compression rates and efficiency were far from satisfactory. More recently, researchers have designed efficient convolution structures to replace the traditional convolution layer, reducing the number of parameters and the amount of computation by exploiting the redundancy of model parameters. However, these methods still suffer from disadvantages such as severe neuron degradation and low generalization ability. In addition, there are channel pruning methods that address feature redundancy; they are effective for structures composed of conventional convolution layers, but they are not suitable for efficient convolution structures such as the depthwise separable convolution and efficient residual blocks.

To solve the above problems, this paper proposes an efficient model compression method based on nonuniform filter groups and a cross-layer pruning method for neural network models based on the analysis of feature redundancy. The main work and contributions of this paper are as follows:

1. First, combining the expanded residual block based on the depthwise separable convolution with a downsampling-layer postposition strategy, this paper proposes an efficient neural network model that alleviates the existing problems of severe neuron degradation and low generalization ability. Furthermore, targeting the pointwise convolution layers, which account for most of the parameters in our model, this paper proposes a new model compression method based on nonuniform pointwise convolution layer grouping, which exploits the nonlinear distribution of filters in the spatial frequency domain. Compared with existing compression methods based on uniform filter groups, this method helps the network learn more reasonable responses from input images or features. Experiments indicate that the proposed model has fewer parameters, a smaller model size, a lower test time, and higher accuracy on the classification task, as sketched below.
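The abstract gives no implementation details, but the following PyTorch sketch illustrates the two ingredients of the first contribution: an expanded residual block built on the depthwise separable convolution (in the MobileNetV2 style), and a pointwise (1x1) convolution whose filters are split into groups of unequal size. The class names and the concrete group sizes are illustrative assumptions, not the thesis's architecture; in particular, the thesis derives its group sizes from the filter distribution in the spatial frequency domain, whereas fixed placeholder splits are used here.

import torch
import torch.nn as nn

class NonuniformGroupedPointwiseConv(nn.Module):
    # A 1x1 convolution whose input channels are split into groups of
    # unequal size, each with its own small filter bank. The split sizes
    # are placeholders; the thesis chooses them from the nonlinear
    # distribution of filters in the spatial frequency domain.
    def __init__(self, in_splits, out_splits):
        super().__init__()
        self.in_splits = list(in_splits)
        self.convs = nn.ModuleList(
            nn.Conv2d(c_in, c_out, kernel_size=1, bias=False)
            for c_in, c_out in zip(in_splits, out_splits))

    def forward(self, x):
        chunks = torch.split(x, self.in_splits, dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.convs, chunks)], dim=1)

class ExpandedResidualBlock(nn.Module):
    # Expanded residual block: pointwise expansion, depthwise 3x3
    # convolution, then a (here nonuniformly grouped) pointwise
    # projection, with an identity shortcut.
    def __init__(self, channels, expansion=6):
        super().__init__()
        hidden = channels * expansion
        self.expand = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True))
        self.depthwise = nn.Sequential(
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True))
        # Hypothetical unequal splits; channels must be divisible by 4 here.
        self.project = NonuniformGroupedPointwiseConv(
            in_splits=[hidden // 2, hidden // 4, hidden // 4],
            out_splits=[channels // 2, channels // 4, channels // 4])
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        return x + self.bn(self.project(self.depthwise(self.expand(x))))

# Example: block = ExpandedResidualBlock(16); y = block(torch.randn(1, 16, 32, 32))

Grouping the pointwise layer is where the savings come from: a dense 1x1 layer costs C_in x C_out parameters, while the grouped version costs only the sum of the per-group products.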
2. Because the depthwise convolution layers in the expanded residual block contain extremely few parameters, this paper proposes a cross-layer pruning method for neural networks based on the analysis of feature redundancy. The main idea is to use the learnable scale parameter gamma of the batch normalization layer as a factor characterizing the importance of features, and then to prune only the pointwise convolution layers according to this factor and a global pruning threshold, so as to better preserve the effective information of the depthwise convolution layers. This overcomes the limitation of methods that must prune every convolution layer in a CNN model and are therefore unsuitable for efficient convolution structures. In the experiments, we first apply cross-layer pruning to the efficient neural network model from the first contribution; in addition, we combine the expanded residual block and cross-layer pruning to jointly compress existing classical image denoising models. Simulation results demonstrate the validity of the proposed methods. (The channel-selection step is sketched below.)
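As a rough illustration of the second contribution, the sketch below shows only the channel-selection step of gamma-based cross-layer pruning: collect the batch-normalization scale factors that follow pointwise convolutions, derive one global magnitude threshold, and mark the surviving channels. The `is_pointwise_bn` attribute is a hypothetical tag for this sketch (the thesis does not specify how such layers are identified), and the actual removal of pruned channels from adjacent layers is omitted.

import torch
import torch.nn as nn

def global_gamma_threshold(model, prune_ratio=0.5):
    # Gather |gamma| from the BN layers that follow pointwise convolutions
    # (tagged here with a hypothetical `is_pointwise_bn` attribute) and
    # return the global magnitude threshold for the requested prune ratio.
    gammas = torch.cat([
        m.weight.detach().abs().flatten()
        for m in model.modules()
        if isinstance(m, nn.BatchNorm2d) and getattr(m, "is_pointwise_bn", False)])
    k = min(int(prune_ratio * gammas.numel()), gammas.numel() - 1)
    return torch.sort(gammas).values[k]

def pointwise_keep_masks(model, threshold):
    # Per-layer boolean masks: keep a pointwise output channel only if its
    # BN scale factor gamma exceeds the global threshold. Depthwise layers
    # are never masked, so their effective information is preserved.
    return {
        name: m.weight.detach().abs() > threshold
        for name, m in model.named_modules()
        if isinstance(m, nn.BatchNorm2d) and getattr(m, "is_pointwise_bn", False)}

Because only the pointwise layers are pruned, the depthwise convolutions that carry very few parameters are left intact, which is the point of pruning across layers rather than pruning each convolution layer independently.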
Keywords/Search Tags: Model Compression, Expanded Residual Block, Nonuniform Pointwise Convolution Layer Grouping, Cross-layer Pruning