
Research On Model Compression And Acceleration Algorithm Based On Channel Selection

Posted on: 2019-08-02
Degree: Master
Type: Thesis
Country: China
Candidate: B B Han
Full Text: PDF
GTID: 2428330545497905
Subject: Software engineering
Abstract/Summary:
In recent years, deep learning has become the state of the art for machine-learning tasks such as computer vision, speech recognition, and natural language processing, and has achieved breakthroughs in these areas. Nevertheless, deep learning models are computationally and storage intensive, which makes them difficult to deploy on embedded systems with limited hardware resources. Reducing the number of parameters and the amount of computation, so as to compress the model and speed up inference, is therefore of great practical significance.

This thesis proposes two model compression and acceleration algorithms based on channel pruning, together with a unified channel-pruning compression framework.

The first is an entropy-based channel pruning method. It computes the entropy of each channel's activation tensor in every layer, using entropy as a measure of how much information a channel carries and hence of its importance. By ranking channels by their entropy scores, channels with lower entropy can be removed, compressing and accelerating the model.

The second is an optimal-subset channel pruning method. The activation tensor of each layer is both the output of the previous layer and the input to the next. A greedy algorithm selects an optimal subset of channels whose activations can best substitute for the full tensor as input to the next convolutional layer, producing an approximation of its original output. Channels outside this subset can then be removed to compress and accelerate the model.

During pruning, a layer-by-layer prune-and-fine-tune strategy is adopted. Only one layer is pruned at a time, to keep the performance loss as small as possible, and after each pruning step the model is briefly fine-tuned to recover as much performance as possible. The layers are pruned alternately and iteratively, and after the final pruning step a careful fine-tuning pass produces the final compressed model.

This thesis verifies the effectiveness of both methods on several datasets and multiple convolutional network architectures. The entropy-based channel pruning algorithm takes face recognition as its base task, with experiments on both small and large datasets and on different network structures (such as residual networks). The experimental results show that the entropy-based channel selection method accurately measures the amount of information in each channel and thus evaluates its importance. The optimal-subset channel pruning algorithm is evaluated on multi-class classification tasks, again across different datasets and network structures, and is compared against other channel selection methods for pruning. The results show that the optimal subset captures the overall characteristics of all channels well: channels outside the subset can be removed safely without affecting model performance. The experiments also confirm the effectiveness of the pruning strategy proposed in this thesis.
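The thesis does not include an implementation of the entropy-based channel score. As a rough illustration of the idea, the sketch below estimates per-channel entropy from a histogram of each channel's activations and drops the lowest-scoring channels; the function names, the histogram-based estimator, and the (N, C, H, W) activation layout are assumptions, not the author's code.

```python
import numpy as np

def channel_entropy_scores(acts, bins=64):
    """Estimate the entropy of each channel's activations.

    acts: activation tensor of shape (N, C, H, W).
    Returns an array of C entropy scores (higher = more information).
    """
    _, C, _, _ = acts.shape
    scores = np.empty(C)
    for c in range(C):
        vals = acts[:, c].ravel()
        hist, _ = np.histogram(vals, bins=bins)
        p = hist / hist.sum()          # empirical distribution over bins
        p = p[p > 0]                   # avoid log(0)
        scores[c] = -(p * np.log(p)).sum()
    return scores

def prune_lowest(scores, ratio=0.5):
    """Return the sorted indices of channels to keep after dropping
    the fraction `ratio` with the lowest entropy."""
    k = int(len(scores) * ratio)
    keep = np.argsort(scores)[k:]      # drop the k lowest-entropy channels
    return np.sort(keep)
```

A channel with (near-)constant activations concentrates all probability mass in one histogram bin, so its entropy is zero and it is pruned first, which matches the intuition that such a channel carries no information.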
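The greedy optimal-subset selection can be sketched as follows. Assuming each channel's contribution to the next layer's output is flattened into one row of a matrix (a simplification of the actual convolution), the greedy step repeatedly adds the channel that most reduces the residual between the subset's approximation and the full-channel output. The data layout and function name are hypothetical, not taken from the thesis.

```python
import numpy as np

def greedy_channel_subset(contribs, k):
    """Greedily pick k channels whose summed contributions best
    approximate the output produced by all channels.

    contribs: array of shape (C, M) where row c is the flattened
              contribution of channel c to the next layer's output.
    Returns the sorted indices of the selected channels.
    """
    C, _ = contribs.shape
    target = contribs.sum(axis=0)        # output with all channels present
    approx = np.zeros_like(target)
    selected, remaining = [], list(range(C))
    for _ in range(k):
        # choose the channel that minimizes the residual norm
        errs = [np.linalg.norm(target - (approx + contribs[c]))
                for c in remaining]
        best = remaining[int(np.argmin(errs))]
        selected.append(best)
        approx += contribs[best]
        remaining.remove(best)
    return sorted(selected)
```

Channels left outside the returned subset are the ones whose removal changes the approximated output least, which is the criterion the thesis uses to decide which channels can be pruned safely.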
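The layer-by-layer prune-and-fine-tune loop described above can be sketched in outline. The `prune_layer` and `finetune` callables and the epoch counts are placeholders standing in for whatever training framework is used; only the control flow (prune one layer, briefly fine-tune, repeat, then a final careful fine-tune) reflects the strategy in the thesis.

```python
def iterative_prune(model, layers, prune_layer, finetune,
                    recover_epochs=1, final_epochs=10):
    """Prune `layers` one at a time, fine-tuning after each step,
    then run a longer final fine-tune on the fully pruned model."""
    for layer in layers:
        prune_layer(model, layer)          # prune a single layer only
        finetune(model, recover_epochs)    # short fine-tune to recover accuracy
    finetune(model, final_epochs)          # careful final fine-tune
    return model
```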
Keywords/Search Tags:Deep Learning, Model Compression, Channel Selection