
Research On Model Compression Method Of Deep Convolution Neural Network

Posted on: 2021-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: H Lan
Full Text: PDF
GTID: 2428330623468305
Subject: Engineering

Abstract/Summary:
In recent years, deep learning technology represented by convolutional neural networks has shown excellent performance in computer vision, natural language processing, recommendation systems, and other scenarios. At the same time, innovations in training convolutional neural networks and improvements in computing power have allowed deep convolutional neural networks to be widely used and deployed. However, behind this excellent performance lies a high computational cost. A deep convolutional network is composed mainly of many convolutional layers; this structure leads to a large number of parameters, and running the model also requires a great deal of computation, which hinders its application on devices with low computing power, tight power budgets, or limited storage. Model compression methods for deep convolutional neural networks reduce the parameters and computation needed to deploy a model, making deployment on resource-limited equipment possible. This thesis studies model compression methods for deep convolutional neural networks. The main contributions are as follows:

(1) A model compression method combining knowledge distillation and model quantization is proposed to address the large number of layers and low computational efficiency of deep convolutional neural networks. In this method, an untrained deep convolutional neural network is used as the teacher network, and an untrained shallow ternary convolutional neural network is used as the student network. During training, the knowledge the teacher network extracts from the training set is transferred to the shallow ternary student network to guide its training (an illustrative sketch is given below). The results show that, after training, the shallow ternary student network learns the teacher network's task ability well. Compared with the large teacher network, the shallow ternary network requires far fewer parameters and much less computation to deploy.

(2) A pruning method based on a primary-secondary network model is proposed to address the large computing resources consumed by convolution kernels in deep convolutional neural networks and their low computational efficiency. In this method, an untrained deep convolutional neural network is used as the primary network to be pruned, and a convolutional neural network with the same structure as the primary network but with ternary-quantized parameters is used as the secondary network. The primary and secondary networks are trained simultaneously on the same training data. During training, each parameter in the primary network is adjusted or clipped according to the corresponding parameter in the secondary network (see the second sketch below). After training, a large number of useless parameters in the primary network's convolution kernels are clipped, so the storage space and computation required to run it are reduced while the model's performance remains unchanged or even improves.

(3) A model compression framework for deep convolutional neural networks is constructed from these deep learning models, and the effectiveness of the proposed methods is verified by comparison with other deep convolutional neural network model compression methods on multiple open datasets.

In conclusion, to reduce the parameters and computation required to deploy deep convolutional networks, this thesis proposes several model compression algorithms and carries out experimental verification and result analysis.
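The two contributions above are described only in prose; the sketches below are illustrative and are not taken from the thesis. They assume PyTorch, a TWN-style (Ternary Weight Networks) quantization rule with the common 0.7 threshold heuristic, and a standard Hinton-style distillation loss; the names ternarize, distillation_loss, train_step, and all hyper-parameters are hypothetical placeholders, and the thesis's actual quantization rule, loss, and training schedule may differ.

```python
# Sketch 1: knowledge distillation into a ternary student (contribution 1).
# Assumes the teacher is usable as a source of soft targets; the thesis's exact
# teacher/student training schedule may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

def ternarize(w, delta_scale=0.7):
    """TWN-style quantizer: map a weight tensor to {-alpha, 0, +alpha}."""
    delta = delta_scale * w.abs().mean()            # threshold below which weights become 0
    mask = (w.abs() > delta).float()                # positions that stay nonzero
    alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)  # per-tensor scale
    return alpha * torch.sign(w) * mask

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target KL term (teacher knowledge) plus hard-label cross entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def train_step(teacher, student, optimizer, images, labels):
    """One distillation step with the student's conv weights ternarized in place.
    A full implementation would keep full-precision shadow weights and a
    straight-through estimator; that bookkeeping is omitted for brevity."""
    with torch.no_grad():
        for m in student.modules():
            if isinstance(m, nn.Conv2d):
                m.weight.copy_(ternarize(m.weight))
        teacher_logits = teacher(images)            # teacher provides soft targets only
    student_logits = student(images)
    loss = distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The second contribution clips a parameter of the primary network wherever the corresponding ternary parameter of the secondary network is zero. The sketch below reuses the imports and the ternarize helper above and, as a simplification, derives the "secondary network" values by ternarizing the primary network's own weights; in the thesis the secondary network is a separately trained ternary model with the same architecture.

```python
def prune_with_secondary(primary, delta_scale=0.7):
    """Zero (clip) primary-network conv weights whose ternarized counterpart is 0."""
    with torch.no_grad():
        for m in primary.modules():
            if isinstance(m, nn.Conv2d):
                keep = (ternarize(m.weight, delta_scale) != 0).float()
                m.weight.mul_(keep)                 # pruned positions become exactly 0
```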
Keywords/Search Tags:model compression, deep convolutional neural network, knowledge distillation, network quantization, network pruning