
Research And Improvement Of Model Compression Method Based On Deep Convolutional Neural Network

Posted on: 2020-03-21  Degree: Master  Type: Thesis
Country: China  Candidate: H Y Chen  Full Text: PDF
GTID: 2428330596995348  Subject: Electronic and communication engineering
Abstract/Summary:
With the development of technology, AI as represented by deep neural networks has gained wide recognition. Deep neural networks are now applied in computer vision, speech, natural language processing, and other fields, and have achieved great success. However, deep neural network models contain an enormous number of parameters, which makes transfer learning difficult, imposes heavy hardware requirements, and prevents these models from being trained or deployed on smaller devices. How to compress a deep convolutional neural network model so that it can run on resource-limited hardware is therefore an important problem. The compression method proposed in this paper consists of the following three steps:

(1) First, an adaptive iterative pruning algorithm is proposed. It prunes the network weights against a threshold that is raised in increments, then retrains the network to recover accuracy and identify the connections that matter, and repeats the prune-and-retrain cycle: important connections are kept, unimportant connections are cut, and the accuracy of the network is preserved. This completes the first stage of compression and effectively reduces the number of network parameters.

(2) Next, a Mini-Batch K-means clustering algorithm is used to cluster the weights of each layer. After clustering, the resulting cluster centers represent the weights of that layer, achieving weight sharing and yielding a codebook; accuracy-recovery training is then applied to the codebook so that accuracy is unchanged. Quantizing the weights to their cluster centers reduces the number of bits needed to represent each weight, reducing the storage the network parameters require and compressing the deep neural network further.

(3) Finally, for the pruned and weight-shared quantized parameters, Huffman coding is applied to resolve the index redundancy that inconsistent code lengths cause when the model is stored, compressing the network effectively once more and further reducing the space it requires.

In this paper, experiments on the LeNet-5, AlexNet, and VGG-16 networks are carried out with the TensorFlow framework. The experiments show that the proposed method effectively compresses the parameters of the network model and reduces model complexity without loss of accuracy. Comparative experiments on parameter tuning during the accuracy-recovery training are also reported, and the effectiveness of the compression algorithm is demonstrated by comparison with other compression methods. In the end, with accuracy unchanged, the method achieved compression ratios of 40x on LeNet-5, 37x on AlexNet, and 55x on VGG-16.
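Step (1) above can be sketched in NumPy. This is a minimal illustration, not the thesis's exact procedure: the threshold schedule (`start`, `step`) is a hypothetical stand-in for the adaptive increment, and the retraining between rounds is omitted.

```python
import numpy as np

def prune_by_threshold(weights, threshold):
    """Zero out connections whose magnitude falls below the threshold;
    return the pruned weights and a boolean mask of survivors."""
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def iterative_prune(weights, start=0.01, step=0.01, rounds=3):
    """Repeatedly prune with an increasing threshold. In the full
    pipeline the surviving weights would be retrained between rounds
    to recover accuracy; that step is elided here."""
    mask = np.ones_like(weights, dtype=bool)
    for r in range(rounds):
        threshold = start + r * step
        _, new_mask = prune_by_threshold(weights, threshold)
        mask &= new_mask          # a connection, once cut, stays cut
        weights = weights * mask  # pruned connections are held at zero
    return weights, mask
```

Because pruned connections are zeroed before the next round, each later (larger) threshold only removes further survivors, mirroring the incremental trimming described above.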
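Step (2), weight sharing via clustering, can likewise be sketched. The thesis uses Mini-Batch K-means; for brevity this sketch runs plain full-batch K-means in NumPy, with centroids initialized linearly over the weight range (a common choice in weight-quantization work because it preserves large-magnitude weights). Cluster count and iteration count are illustrative.

```python
import numpy as np

def kmeans_quantize(weights, n_clusters=4, n_iters=20):
    """Cluster the flattened weights and replace each weight by its
    cluster centroid. Only the small codebook of centroids plus an
    integer index per weight then needs to be stored."""
    flat = weights.ravel()
    # Linear initialization over [min, max] of the weights.
    codebook = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        idx = np.argmin(np.abs(flat[:, None] - codebook[None, :]), axis=1)
        # Move each centroid to the mean of its assigned weights.
        for k in range(n_clusters):
            if np.any(idx == k):
                codebook[k] = flat[idx == k].mean()
    idx = np.argmin(np.abs(flat[:, None] - codebook[None, :]), axis=1)
    quantized = codebook[idx].reshape(weights.shape)
    return quantized, codebook, idx.reshape(weights.shape)
```

With `n_clusters` centers, each index needs only `ceil(log2(n_clusters))` bits instead of a 32-bit float, which is the source of the compression; the accuracy-recovery training of the codebook described above is omitted here.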
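Step (3) can be sketched in pure Python with the standard-library `heapq`. The input here stands in for the cluster indices produced by quantization; frequent indices receive shorter codes, which is how Huffman coding removes the index redundancy described above.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code for a sequence of symbols (e.g. the
    cluster indices of quantized weights): the more frequent a
    symbol, the shorter its bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a lone symbol gets code "0"
        return {next(iter(freq)): "0"}
    # Heap entries: (total frequency, tie-breaker, {symbol: code so far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least-frequent subtrees, prefixing their codes.
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]
```

The resulting code is prefix-free, so the variable-length bit strings can be concatenated for storage and still decoded unambiguously.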
Keywords/Search Tags: deep neural network, model compression method, pruning and quantization, Huffman coding