
Research and Implementation of a Deep Convolutional Neural Network Compression Algorithm

Posted on: 2020-01-01    Degree: Master    Type: Thesis
Country: China    Candidate: Q Jia    Full Text: PDF
GTID: 2428330575498585    Subject: Control Science and Engineering
Abstract/Summary:
Deep convolutional neural networks (DCNNs) have become a widely used technique in computer vision and artificial intelligence, achieving great success in tasks such as image classification, recognition, and object detection. However, most network models are computationally expensive and memory intensive, making them difficult to deploy on portable platforms such as modern smartphones, self-driving cars, and other embedded devices. It is therefore important to compress these models and turn dense network structures into sparse ones. Weight pruning exploits the redundancy in the number of weights in DCNNs, while weight clustering reduces the redundancy of weight representations and the multiplication overhead through weight sharing. However, pruning or clustering alone achieves only a limited compression ratio. Moreover, current pruning methods lack an effective theoretical basis, manually presetting the pruning rate of each layer is cumbersome, and the number of clusters and the selection of centroids in existing clustering schemes lack adaptability. This thesis therefore studies weight pruning and clustering algorithms as follows:

(1) A global dynamic saliency-based weight pruning algorithm computes the weight gradient information generated by all data samples during training. The product of the normalized gradient value and the current weight value serves as the final saliency, giving a more comprehensive measure of each weight's contribution to network performance (a minimal sketch of this criterion is given after the abstract). By setting a single global pruning rate, the algorithm automatically derives the per-layer pruning rates and fully accounts for the correlation between network layers. Finally, pruning and retraining are executed iteratively to correct erroneous pruning and preserve the performance of the pruned network. Experimental results show that this pruning method effectively reduces parameter redundancy and improves network accuracy.

(2) Weight pruning for a dedicated hardware acceleration circuit computes the saliency and prunes the weights of the convolutional and fully-connected layers separately, balancing the number of pruned parameters against the amount of computation. Experimental results confirm that this pruning method significantly reduces the computation overhead and allows the pruned sparse network to be deployed easily on a hardware platform to accelerate computation.

(3) The adaptive clustering model compression algorithm consists of two steps. First, all sample objects are scanned to build a clustering feature tree with BIRCH hierarchical clustering, so the number of clusters is obtained adaptively instead of being set manually. Then the K-Means++ algorithm is introduced to adaptively select the initial centroid of each cluster (a sketch of this two-step clustering is also given below). Experimental results show that combining BIRCH and K-Means++ in this adaptive clustering algorithm yields more reasonable cluster counts and centers, achieves more weight sharing while maintaining network accuracy, and makes the whole clustering process efficient and automatic.
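For item (1), the following is a minimal Python/PyTorch sketch of the gradient-times-weight saliency criterion driven by a single global pruning rate. The function names, the restriction to tensors with more than one dimension, and the per-tensor max normalization are illustrative assumptions, not the thesis's actual implementation.

```python
import torch

def compute_saliency(model, data_loader, loss_fn, device="cpu"):
    """Accumulate |gradient| over all samples, then score each weight by
    normalized-gradient * |weight| (the saliency used for global pruning)."""
    grad_sums = {n: torch.zeros_like(p) for n, p in model.named_parameters()
                 if p.dim() > 1}                      # weight tensors only
    model.to(device)
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x.to(device)), y.to(device)).backward()
        for n, p in model.named_parameters():
            if n in grad_sums and p.grad is not None:
                grad_sums[n] += p.grad.abs()
    saliency = {}
    for n, p in model.named_parameters():
        if n in grad_sums:
            g = grad_sums[n]
            g = g / (g.max() + 1e-12)                 # normalize gradients to [0, 1]
            saliency[n] = g * p.detach().abs()        # contribution estimate
    return saliency

def build_masks(saliency, global_rate=0.7):
    """Derive per-layer masks from ONE global pruning rate: the threshold is
    taken over all layers jointly, so each layer's rate emerges automatically."""
    all_scores = torch.cat([s.flatten() for s in saliency.values()])
    k = int(global_rate * all_scores.numel())
    threshold = torch.kthvalue(all_scores, max(k, 1)).values
    return {n: (s > threshold).float() for n, s in saliency.items()}
```

Pruning and retraining would then alternate, with each weight tensor multiplied by its mask after every update; that retraining loop is omitted here.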
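For item (3), here is a sketch of the two-step adaptive weight-sharing procedure using scikit-learn's Birch and KMeans (with k-means++ initialization). The BIRCH threshold value and the way the cluster count is read from the leaf subclusters are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import Birch, KMeans

def adaptive_weight_sharing(weights, birch_threshold=0.05):
    """Cluster a layer's weights: BIRCH decides how many clusters are needed,
    K-Means++ then selects initial centroids and refines the shared values."""
    w = np.asarray(weights, dtype=np.float64).reshape(-1, 1)   # one weight per sample

    # Step 1: BIRCH builds a clustering feature tree; with n_clusters=None the
    # number of leaf subclusters serves as the adaptively obtained cluster count.
    birch = Birch(threshold=birch_threshold, n_clusters=None).fit(w)
    n_clusters = birch.subcluster_centers_.shape[0]

    # Step 2: K-Means with k-means++ initialization picks and refines centroids.
    km = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10).fit(w)

    # Every weight is replaced by its cluster center, so only n_clusters distinct
    # values (plus per-weight cluster indices) need to be stored.
    shared = km.cluster_centers_[km.labels_].reshape(np.shape(weights))
    return shared, km.cluster_centers_.ravel(), km.labels_

# Example usage on one layer's weights (hypothetical variable names):
# layer_w = conv_layer.weight.detach().cpu().numpy()
# shared_w, centers, codes = adaptive_weight_sharing(layer_w)
```

In a full compression pipeline this routine would be applied layer by layer, and only the cluster centers plus the per-weight indices would be stored.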
Keywords/Search Tags:Network compression, Dense network, Weight pruning, Weight clustering, Weight sharing, Sparse network