
Research On Deep Neural Networks Compression Based On Sensitivity Pruning Method

Posted on: 2020-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2428330596479278
Subject: Control theory and control engineering
Abstract/Summary:
With the growing demand for artificial-intelligence devices, deep neural network algorithms are being applied and developed ever more widely. The problems, however, are equally evident: the size of a deep learning model grows rapidly as the dataset grows, and a deep convolutional neural network typically involves millions of parameters across its layers, which greatly increases the system's storage requirements. Owing to factors such as mobile-device specifications and battery capacity, the available storage often cannot meet the demands of a deep convolutional neural network.

To address this problem, this thesis proposes a sensitivity-based pruning method for compressing deep neural networks, reducing their storage requirements. Sensitivity pruning proceeds in three steps. First, the deep network model is trained initially. Second, to make the network sparse, sensitivity is used as the pruning criterion, and connections that contribute little to the network are removed. Finally, the network is retrained to redetermine the weights of the remaining sparse connections, achieving an initial compression of the network without loss of accuracy.

On this basis, the k-means++ clustering algorithm is applied to the weights of each layer, so that many connections share the same cluster-center value, effectively reducing the number of weights that must be stored. The cluster centers are then used to quantize the network, which is retrained to obtain updated centers. This quantization step further compresses the network, reducing its memory footprint and achieving the goal of compression. A minimal sketch of these two stages is given after this abstract.

In this thesis, deep convolutional networks including LeNet, VGG-16, and AlexNet are compressed on the Caffe platform. Compared with the trial-and-error method, the proposed sensitivity-based pruning is more effective: LeNet-5 is compressed 37.5 times, 1.5 times more than the trial-and-error method; LeNet-300-100 is compressed 35 times, 1 time more than trial and error; VGG-16 and AlexNet are compressed 41 times and 34 times, respectively. The compression ratio is increased without loss of accuracy, which makes it feasible to deploy complex deep neural networks in mobile applications.
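The sketch below illustrates the two compression stages the abstract describes, using plain NumPy and scikit-learn rather than the thesis's Caffe pipeline. It is an illustration under stated assumptions, not the author's implementation: the sensitivity score is approximated here by weight magnitude (the thesis's exact sensitivity measure is not given in the abstract), and the function names, sparsity level, and cluster count are all hypothetical.

```python
# Minimal sketch of sensitivity pruning + k-means++ weight sharing.
# Assumptions: magnitude stands in for the sensitivity measure, and a
# single weight matrix stands in for one layer of a Caffe model.
import numpy as np
from sklearn.cluster import KMeans

def prune_by_sensitivity(weights, sparsity=0.9):
    """Remove the fraction `sparsity` of connections with the lowest
    (assumed) sensitivity; the returned mask is kept fixed during the
    retraining step so pruned connections stay at zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold        # surviving connections
    return weights * mask, mask

def quantize_weights(weights, mask, n_clusters=32):
    """Cluster the surviving weights with k-means++ so that many
    connections share one centroid; only the centroids and per-weight
    cluster indices would need to be stored."""
    survivors = weights[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10).fit(survivors)
    quantized = np.zeros_like(weights)
    quantized[mask] = km.cluster_centers_[km.labels_].ravel()
    return quantized, km.cluster_centers_

# Toy usage on one random 64x64 layer:
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
W_pruned, mask = prune_by_sensitivity(W, sparsity=0.9)
W_quant, centers = quantize_weights(W_pruned, mask, n_clusters=16)
print(f"nonzero: {mask.sum()} / {W.size}, "
      f"unique shared values: {len(np.unique(W_quant[mask]))}")
```

In the full method, each pruning and quantization pass is followed by retraining, and the updated cluster centers replace the initial ones; the sketch omits those training loops.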
Keywords/Search Tags:sensitivity pruning, deep neural network, k-means++ clustering, compression of deep network