Deep neural networks are difficult to deploy on embedded devices with limited resources because of their large size; at the same time, they suffer from serious redundancy and high computational complexity. Efficient compression and acceleration of deep neural networks is therefore of great research value. To address this problem, this paper proposes using particle swarm optimization to improve deep neural network compression. The proposed compression method consists of four steps. In the first step, a sparsity-based pruning method removes the redundant weights of the pre-trained model. In the second step, the K-means++ algorithm clusters the weights of each pruned layer; the original weights are replaced by the cluster centers, so that each layer shares a small set of center weights and the network size is reduced. In the third step, the clustered weights of each layer are quantized; quantization reduces the number of bits needed to represent each weight, further compressing the deep neural network. In the fourth step, a particle swarm optimization algorithm fine-tunes the network parameters to avoid falling into a local optimum and to restore the accuracy of the network. Compression experiments on the LeNet-300-100, LeNet-5 and VGG-16 networks measure the resulting compression rate and accuracy. With no loss of accuracy (and in some cases a slight improvement), the method achieves a compression ratio of 30 to 40 times, which makes it practical to deploy deep neural networks on embedded devices.
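The first three steps (pruning, weight sharing via K-means++, and quantization) can be illustrated with a short sketch. The abstract gives no implementation details, so the following is a minimal illustration assuming magnitude-based pruning and scikit-learn's KMeans with k-means++ initialization; the function names, the 90% sparsity level, and the 16-cluster setting are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def prune_by_magnitude(weights, sparsity=0.9):
    """Step 1 (assumed variant): zero out the smallest-magnitude weights."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def share_weights(weights, mask, n_clusters=16):
    """Steps 2-3: cluster the surviving weights with k-means++ and replace
    each one by its cluster center; with 16 clusters, every weight is then
    representable by a 4-bit index into the center table (the quantization
    step that reduces the bits needed per weight)."""
    surviving = weights[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10).fit(surviving)
    shared = np.zeros_like(weights)
    shared[mask] = km.cluster_centers_[km.labels_].ravel()
    return shared, km.cluster_centers_.ravel()

# Toy usage on a random weight matrix (sizes are illustrative, not a real layer).
rng = np.random.default_rng(0)
W = rng.standard_normal((300, 100))
W_pruned, mask = prune_by_magnitude(W, sparsity=0.9)
W_shared, centers = share_weights(W_pruned, mask, n_clusters=16)
```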
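For the fourth step, the abstract names particle swarm optimization but not a specific formulation; a textbook global-best PSO over the shared cluster centers might look like the sketch below. Here `loss_fn` is a hypothetical callable that loads a candidate center vector into the compressed network and returns its loss on a validation batch; the inertia and acceleration coefficients are conventional defaults, not the paper's settings.

```python
import numpy as np

def pso_finetune(centers, loss_fn, n_particles=20, iters=50,
                 w=0.7, c1=1.5, c2=1.5, spread=0.01):
    """Step 4 (assumed formulation): global-best PSO over the shared centers."""
    dim = centers.size
    rng = np.random.default_rng(0)
    # Particles start as small perturbations of the k-means++ solution.
    pos = centers + spread * rng.standard_normal((n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_val = np.array([loss_fn(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity: inertia + pull toward personal best + pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([loss_fn(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest  # fine-tuned shared weights (cluster centers)
```

Because only the small set of shared centers per layer is optimized rather than every individual weight, the PSO search space stays low-dimensional even for large networks, which is one plausible reason a derivative-free optimizer is workable at this stage.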