Convolutional Neural Network Compression By Fusing Weight And Filter Pruning

Posted on:2020-06-11

Degree:Master

Type:Thesis

Country:China

Candidate:J L Han

GTID:2428330602964241

Subject:Information processing and Internet of Things technologies

Abstract/Summary:

Convolutional neural networks (CNNs) are widely used in computer vision because of their strong feature learning and feature representation abilities, and in recent years they have become a research hotspot in the field. However, complex neural network models have large parameter counts, high computational demands, and large storage footprints, which makes them difficult to deploy on embedded devices or mobile terminals in practical applications. Studying the compression of convolutional neural networks is therefore important. In this thesis, we compare existing CNN compression methods and propose a new model compression method that fuses weight pruning and filter pruning. Object detection and image segmentation are classic application areas of CNNs, so to verify the effectiveness of the proposed method we compress the object detection network SSD300 and the image segmentation network FCN.

The main research content covers the following aspects:

First, after studying the current mainstream model compression methods, we propose a new compression method that fuses weight and filter pruning, based on the sparsity produced by weight pruning and on the relationship between weights and convolutional computation. The method has three steps. The redundant weights are first pruned to obtain the sparsity of the effective weights in each convolutional layer. Then, to reduce the computation in the convolutional layers, the redundant filters are pruned according to the percentage of weights in each layer. Finally, the pruned network is retrained to restore its performance.

Second, we compress the SSD300 network with the fused weight and filter pruning method. After compression, SSD300 requires 12.5M of storage and detects at 50 FPS. The method achieves a 2x speed-up and reduces the storage required by SSD300 by 8.4x, with as little increase in error as possible. This makes it possible to embed SSD300 in intelligent systems to detect and track objects.

Third, we compress the FCN network model used for feature extraction of welding seams. Our method reduces the storage required by FCN by 22.0x, from 512MB to 22.3MB, which allows the model to fit into on-chip SRAM cache. The compression also facilitates the use of FCN in weld tracking systems.
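The first two steps of the method (weight pruning, then filter pruning driven by per-filter sparsity) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact criteria, so everything specific here is an assumption made for illustration: the function names `prune_weights` and `prune_filters`, the global magnitude threshold, and the rule that a filter is dropped when its fraction of surviving weights falls below `min_density`. The retraining step is omitted.

```python
import numpy as np

def prune_weights(w, sparsity):
    """Zero the smallest-magnitude weights (assumed criterion: magnitude)."""
    k = int(sparsity * w.size)  # number of weights to remove
    if k == 0:
        return w.copy()
    # k-th smallest absolute value; ties at the threshold are pruned too
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def prune_filters(w, min_density):
    """Drop filters whose share of nonzero weights is below min_density.

    w has shape (out_channels, in_channels, kernel_h, kernel_w).
    Returns the kept filters and the boolean keep mask.
    """
    flat = w.reshape(w.shape[0], -1)
    density = np.count_nonzero(flat, axis=1) / flat.shape[1]
    keep = density >= min_density
    return w[keep], keep

# Toy layer: 4 filters, 1 input channel, 2x2 kernels, weights 1..16
layer = np.arange(1, 17, dtype=float).reshape(4, 1, 2, 2)
sparse_layer = prune_weights(layer, sparsity=0.5)
kept_filters, keep_mask = prune_filters(sparse_layer, min_density=0.5)
```

In this toy layer the two filters holding the smallest weights become entirely zero after weight pruning, so the filter step removes them and the layer shrinks from four filters to two. In a real network, removing a filter also removes the matching input channel of the next layer, which the sketch does not handle.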

Keywords/Search Tags:

convolutional neural network, model compression, weight pruning, filter pruning

Related items

1	Convolutional Neural Network Compression By Fusing Weight And Filter Pruning
2	Research On Convolutional Neural Network Compression Method Based On Dynamic Pruning And Weight Resetting
3	Research On Compression Method For Convolutional Neural Network Based On Pruning
4	Research And Application Of Neural Network Model Compression Based On Weight Pruning
5	Research On Deep Neural Network Model Compression Method Based On Parameter Pruning
6	Research On Channel Pruning Algorithm Of Convolutional Neural Networks
7	The Research On Algorithm Optimization Of Convolutional Neural Network Model Compression
8	Research On Multi-grained Pruning Algorithm Of Convolutional Neural Networks
9	Research On Deep Network Compression Method Based On Model Gradient Information
10	The Study Of Pruning Methods Of Deep Neural Network