
Research On Model Compression And Acceleration Method Based On Convolutional Neural Network

Posted on: 2020-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: T Jiang
Full Text: PDF
GTID: 2428330596976141
Subject: Signal and Information Processing
Abstract/Summary:
In recent years, deep learning has developed rapidly and has shown excellent performance in many areas, such as computer vision and natural language processing. As a common deep learning method, convolutional neural networks offer outstanding performance and have been widely used in engineering. However, a convolutional neural network model is built by cascading many layers, and its training requires a large amount of data, so the resulting model has a large number of parameters and a high computational cost. This hinders its application on mobile devices with low power consumption, low bandwidth, and limited storage. Convolutional neural networks usually contain many redundant parameters, which can be reduced by model compression and acceleration methods. This thesis focuses on the model compression and acceleration of convolutional neural networks. The main work and contributions are as follows:

(1) To address the large model size and high computational cost caused by parameter redundancy in convolutional neural networks, a knowledge refining method based on entropy attention is proposed. The method uses a cumbersome, high-performance model with many parameters as the teacher network and a small, lower-performance model with few parameters as the student network. Exploiting the parameter redundancy of the cumbersome model, the information content of each activation channel is computed with an information-entropy measure, and the channels are then ranked by this amount. The activation channels with the highest information content are selected as the source of strong supervision, which supervises the student network at the corresponding convolutional layers. This supervisory signal guides the student network's learning through a weighted sum of the L2 loss between corresponding teacher and student activation channels and the cross-entropy loss. Iterative training shows that the student network obtained with this method can have up to twenty times fewer parameters, performs better than it did without this supervision, and approaches the performance of the teacher network.

(2) To address the excessive parameter redundancy and high computational cost of convolutional neural networks, a convolution kernel pruning method based on an entropy importance criterion is proposed. First, the information content of each activation channel of a convolutional layer is obtained by computing its image entropy. Then, the importance of each convolution kernel is measured by the entropy importance criterion, and the kernels with the lowest information content are pruned. After iterative retraining, the method yields a compressed model with little performance loss, or even higher performance: its parameter size can be reduced by about six times and its computation accelerated by about three times.

(3) Based on the proposed methods, network models are built with a deep learning framework, and detailed comparison experiments against other model compression and acceleration methods are carried out on multiple datasets. The experimental results verify the effectiveness of the proposed methods.

In summary, starting from the parameter redundancy of convolutional neural network models, several optimization algorithms for model compression and acceleration are proposed, together with experimental verification and a detailed analysis of the experimental results.
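To make the entropy-attention distillation idea in (1) concrete, here is a minimal sketch, assuming PyTorch. The histogram-based channel-entropy estimate, the number of selected channels (top_k), the loss weight (alpha), and the assumption that the teacher and student feature maps already have matching shapes (e.g. via an adapter layer) are illustrative choices, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def channel_entropy(feat: torch.Tensor, bins: int = 32) -> torch.Tensor:
    """Estimate the information entropy of each activation channel.

    feat: (N, C, H, W) feature map. Returns a (C,) tensor of per-channel
    entropies computed from a histogram of activation values.
    """
    n, c, _, _ = feat.shape
    flat = feat.reshape(n, c, -1)
    ent = torch.zeros(c, device=feat.device)
    for ch in range(c):
        vals = flat[:, ch, :].reshape(-1)
        hist = torch.histc(vals, bins=bins)          # value histogram
        p = hist / hist.sum().clamp(min=1)           # normalize to probabilities
        p = p[p > 0]                                 # drop empty bins (log(0))
        ent[ch] = -(p * p.log()).sum()
    return ent

def distill_loss(student_feat, teacher_feat, student_logits, labels,
                 top_k=16, alpha=0.5):
    """Weighted sum of a per-channel L2 loss (restricted to the teacher's
    highest-entropy activation channels) and the usual cross-entropy loss."""
    with torch.no_grad():
        ent = channel_entropy(teacher_feat)
        idx = ent.topk(top_k).indices                # most informative channels
    l2 = F.mse_loss(student_feat[:, idx], teacher_feat[:, idx])
    ce = F.cross_entropy(student_logits, labels)
    return alpha * l2 + (1 - alpha) * ce
```

In training, the student would be optimized on distill_loss computed at one or more corresponding layer locations, with the teacher frozen.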
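Similarly, a minimal sketch of the entropy-criterion filter pruning in (2), again assuming PyTorch and reusing channel_entropy from the sketch above. The calibration batch and prune_ratio are hypothetical parameters introduced only for illustration.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def select_filters_by_entropy(conv: nn.Conv2d, calib_input: torch.Tensor,
                              prune_ratio: float = 0.3) -> torch.Tensor:
    """Rank a conv layer's filters by the entropy of their activation maps
    on calibration data and return the indices of the filters to keep."""
    feat = conv(calib_input)                   # (N, C_out, H, W)
    ent = channel_entropy(feat)                # per-output-channel entropy
    n_keep = int(feat.shape[1] * (1 - prune_ratio))
    return ent.topk(n_keep).indices.sort().values

def prune_conv(conv: nn.Conv2d, keep: torch.Tensor) -> nn.Conv2d:
    """Build a thinner conv layer containing only the kept filters."""
    new_conv = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                         conv.stride, conv.padding,
                         bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep].clone()
    return new_conv
```

In a full pipeline, the next layer's input channels would be sliced to match, and the pruned network retrained iteratively, as the abstract describes.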
Keywords/Search Tags: convolutional neural network, model compression, model acceleration, knowledge refinement, model pruning