
Research On Deep Neural Networks Compression Based On Sensitivity Pruning Method

Posted on: 2020-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2428330596479278
Subject: Control theory and control engineering
Abstract/Summary:
With the growing demand for artificial-intelligence devices, deep neural network algorithms are being applied and developed ever more widely. The problems, however, are equally evident: the size of a deep learning model grows rapidly as the dataset grows, and a deep convolutional neural network typically involves millions of parameters across its layers, which greatly increases the system's storage requirements. Owing to factors such as mobile-device specifications and battery capacity, the available storage often cannot meet the demands of a deep convolutional neural network.

To address this problem, this thesis proposes a sensitivity-based pruning method for compressing deep neural networks, reducing their storage requirements. Sensitivity pruning proceeds in three steps. First, the deep network model is trained initially. Second, to make the network sparse, sensitivity is used as the pruning criterion, and connections that contribute little to the network are removed. Finally, the network is retrained to redetermine the weights of the remaining sparse connections, achieving an initial compression of the network without loss of accuracy.

On this basis, the k-means++ clustering algorithm is applied to the weights of each layer, so that many connections share the same cluster-center value, effectively reducing the number of weights that must be stored. The cluster centers are then used to quantize the network, which is retrained to obtain updated centers. This quantization step further compresses the network, reducing its memory footprint and achieving the goal of compression. A minimal sketch of these two stages is given after this abstract.

In this thesis, deep convolutional networks including LeNet, VGG-16, and AlexNet are compressed on the Caffe platform. Compared with the trial-and-error method, the proposed sensitivity-based pruning is more effective: LeNet-5 is compressed 37.5 times, 1.5 times more than the trial-and-error method; LeNet-300-100 is compressed 35 times, 1 time more than trial and error; VGG-16 and AlexNet are compressed 41 times and 34 times, respectively. The compression ratio is increased without loss of accuracy, which makes it feasible to deploy complex deep neural networks in mobile applications.
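The sketch below illustrates the two compression stages the abstract describes, using plain NumPy and scikit-learn rather than the thesis's Caffe pipeline. It is an illustration under stated assumptions, not the author's implementation: the sensitivity score is approximated here by weight magnitude (the thesis's exact sensitivity measure is not given in the abstract), and the function names, sparsity level, and cluster count are all hypothetical.

```python
# Minimal sketch of sensitivity pruning + k-means++ weight sharing.
# Assumptions: magnitude stands in for the sensitivity measure, and a
# single weight matrix stands in for one layer of a Caffe model.
import numpy as np
from sklearn.cluster import KMeans

def prune_by_sensitivity(weights, sparsity=0.9):
    """Remove the fraction `sparsity` of connections with the lowest
    (assumed) sensitivity; the returned mask is kept fixed during the
    retraining step so pruned connections stay at zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold        # surviving connections
    return weights * mask, mask

def quantize_weights(weights, mask, n_clusters=32):
    """Cluster the surviving weights with k-means++ so that many
    connections share one centroid; only the centroids and per-weight
    cluster indices would need to be stored."""
    survivors = weights[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10).fit(survivors)
    quantized = np.zeros_like(weights)
    quantized[mask] = km.cluster_centers_[km.labels_].ravel()
    return quantized, km.cluster_centers_

# Toy usage on one random 64x64 layer:
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
W_pruned, mask = prune_by_sensitivity(W, sparsity=0.9)
W_quant, centers = quantize_weights(W_pruned, mask, n_clusters=16)
print(f"nonzero: {mask.sum()} / {W.size}, "
      f"unique shared values: {len(np.unique(W_quant[mask]))}")
```

In the full method, each pruning and quantization pass is followed by retraining, and the updated cluster centers replace the initial ones; the sketch omits those training loops.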
Keywords/Search Tags:sensitivity pruning, deep neural network, k-means++ clustering, compression of deep network