
Convolutional Neural Network Compression By Fusing Weight And Filter Pruning

Posted on: 2020-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: J L Han
GTID: 2428330602964241
Subject: Information processing and Internet of Things technologies
Abstract/Summary:
Convolutional neural networks are widely applied in computer vision because of their strong feature learning and feature representation abilities, and in recent years they have become a research hotspot in the field. However, complex network models have large parameter counts, high computational demands, and large storage footprints, which makes them difficult to deploy on embedded devices or mobile terminals in practical applications. Studying the compression of convolutional neural networks is therefore important. In this thesis, we compare existing convolutional neural network compression methods and propose a new model compression method that fuses weight pruning and filter pruning. Object detection and image segmentation are classical application fields of convolutional neural networks, so to verify the effectiveness of the proposed method we compress the object detection network SSD300 and the image segmentation network FCN. The main research content covers the following aspects:

First, after studying the current mainstream model compression methods, we propose a new compression method that fuses weight and filter pruning, based on the sparsity produced by weight pruning and the relationship between weights and convolutional computation. The method has three steps. Redundant weights are pruned first, which yields the sparsity of the effective weights in each convolution layer. Then, to reduce the computation in the convolution layers, redundant filters are pruned according to the percentage of weights in each layer. Finally, the pruned network is retrained to restore its performance.

Second, we compressed the SSD300 network with the fused weight and filter pruning method. After compression, SSD300 requires 12.5 MB of storage and runs detection at 50 FPS. The method achieves a 2× speed-up and reduces the storage required by SSD300 by 8.4×, with as little increase in error as possible. This makes it possible to embed SSD300 in intelligent systems to detect and track objects.

Third, we compressed the FCN model used for feature extraction of welding seams. Our method reduced the storage required by FCN by 22.0×, from 512 MB to 22.3 MB, which allows the model to fit into on-chip SRAM cache. The compression method also facilitates the use of FCN in weld tracking systems.
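The pruning steps described in the abstract can be illustrated with a small NumPy sketch. This is not the thesis's implementation: the abstract does not state the pruning criteria, so the sketch assumes magnitude-based weight pruning and ranks filters by their L1 norm, keeping a share of filters equal to the fraction of weights that survived weight pruning. The function names `prune_weights` and `prune_filters` and the 50% sparsity value are hypothetical, and the final retraining step is omitted.

```python
import numpy as np

def prune_weights(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of a layer's weights.

    Assumption: magnitude is the redundancy criterion (not stated in the abstract).
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude; everything at or below it is set to zero.
    threshold = np.partition(flat, k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def prune_filters(weights):
    """Keep a share of filters equal to the layer's fraction of non-zero weights.

    Assumption: filters are ranked by L1 norm; the thesis may rank differently.
    """
    n_filters = weights.shape[0]
    density = np.count_nonzero(weights) / weights.size
    n_keep = max(1, int(round(density * n_filters)))
    l1 = np.abs(weights).reshape(n_filters, -1).sum(axis=1)
    keep = np.sort(np.argsort(l1)[::-1][:n_keep])  # indices of kept filters
    return weights[keep], keep

# Toy convolution layer: (out_channels, in_channels, kernel_h, kernel_w).
rng = np.random.default_rng(0)
layer = rng.normal(size=(8, 3, 3, 3))

sparse = prune_weights(layer, sparsity=0.5)   # step 1: weight pruning
compact, kept = prune_filters(sparse)         # step 2: filter pruning
print(compact.shape, kept)
```

In a full network, removing a filter also removes the matching input channel in the following layer, and the pruned model would then be fine-tuned to recover accuracy, as the abstract describes.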
Keywords/Search Tags: convolutional neural network, model compression, weight pruning, filter pruning