
Research On Network Pruning Algorithm Based On Importance Assessment

Posted on: 2023-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: J X Ma
Full Text: PDF
GTID: 2558307025462774
Subject: Computer Science and Technology

Abstract/Summary:
With the continuous development of deep learning, the performance of neural networks has kept improving, but the storage they require and the computing resources they consume have also grown significantly. This raises the hardware requirements of the devices that run them, making deployment on resource-constrained edge devices difficult. Studies have shown that networks contain a large number of redundant parameters, which increase storage consumption while also degrading performance. If these parameters can be removed by a suitable method, known as network pruning, the network can be made lighter while maintaining its accuracy, facilitating deployment and application on edge devices. When pruning a network, it is common to decide which weights to prune by evaluating their importance. This thesis explores different importance evaluation methods in the following three aspects:

(1) A soft pruning method based on the lottery ticket hypothesis is proposed. Soft pruning can restore parameters that were wrongly pruned in early iterations of the algorithm, ensuring that the correct parameters are ultimately removed and the network achieves better performance. To determine which parameters were mistakenly pruned and to recover them, a sparse distillation method is designed to transmit information from the complete, unpruned network to the network being pruned in the current iteration cycle, and a network fusion method merges this information with the currently pruned network. This keeps the pruned network as similar as possible to the unpruned original while preserving sparsity, so the network maintains good performance even at high pruning rates.

(2) Parameter importance is evaluated at three scales: network layer, convolution channel, and individual parameter. At the layer scale, the importance of a layer is determined by measuring network accuracy after that layer is pruned and the network is fine-tuned. At the channel scale, global information about each channel is obtained by global average pooling, and the importance of each channel is obtained through a sigmoid function. At the parameter scale, importance is obtained by computing each parameter's correlation with the other parameters in its convolution kernel. The final importance is the combination of the scores at these three scales. This breaks from earlier methods that measure the importance of network parameters through a single index; using multiple indicators to evaluate importance comprehensively determines parameter importance more reliably and yields better pruning results.

(3) Pruning is combined with a differentiable network architecture search algorithm. The different value ranges that parameters fall into during pruning are treated as different search spaces, and each search space is assigned a corresponding learnable parameter. In the iterative learning process, the search space whose learnable parameter is smallest is pruned from the network in each iteration, and this step is repeated until the network reaches the required pruning rate. The network architecture search algorithm is thus innovatively introduced into the field of network pruning: the parameters to prune are determined by learning rather than by manual selection, which is more automated, does not rely on expert experience, and is more conducive to the practical deployment of pruning algorithms. Experiments show that keeping the middle layers of the network while pruning the largest and smallest parameters achieves better pruning results.
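The soft pruning idea in (1) can be illustrated with a minimal sketch: weights below a magnitude threshold are zeroed rather than removed, so subsequent gradient updates between pruning iterations can restore parameters that were pruned by mistake. The function name and magnitude criterion here are illustrative assumptions, not the thesis's exact procedure.

```python
import numpy as np

def soft_prune_step(weights, prune_rate):
    """One soft-pruning iteration (illustrative sketch): zero out the
    smallest-magnitude weights instead of deleting them. Because the
    zeroed weights keep their positions, later training updates can
    regrow any that were pruned in error."""
    k = int(weights.size * prune_rate)
    flat = np.sort(np.abs(weights).ravel())
    # threshold at the (k+1)-th smallest magnitude; everything below it is zeroed
    threshold = flat[k] if k < flat.size else np.inf
    mask = np.abs(weights) >= threshold
    return weights * mask, mask
```

Between calls to this step, all weights (including the zeroed ones) would continue to receive gradient updates, which is what distinguishes soft pruning from hard removal.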
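The channel-scale score in (2) — global average pooling followed by a sigmoid — can be sketched as follows. This is an assumed minimal implementation; the thesis's actual formulation may include learned weights between the pooling and the sigmoid.

```python
import numpy as np

def channel_importance(feature_map):
    """Channel-scale importance sketch: global average pooling
    summarizes each channel of a (batch, channels, H, W) feature map,
    and a sigmoid maps the pooled value to a (0, 1) importance score."""
    pooled = feature_map.mean(axis=(2, 3))     # (batch, channels)
    scores = 1.0 / (1.0 + np.exp(-pooled))     # elementwise sigmoid
    return scores.mean(axis=0)                 # per-channel score, averaged over the batch
```

Channels whose score falls below a chosen threshold would then be candidates for pruning; combining this score with the layer-scale and parameter-scale scores gives the final importance described in the abstract.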
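The search-space pruning loop in (3) can be sketched like this: each value range of the weights is a "search space" with its own learnable score, and the space with the smallest score is pruned each iteration until the target rate is met. The argument names (`space_edges`, `scores`) and the fixed-score loop are assumptions for illustration; in the thesis the scores would be updated by differentiable learning between iterations.

```python
import numpy as np

def prune_by_learnable_scores(weights, space_edges, scores, target_rate):
    """NAS-inspired pruning sketch: value ranges [space_edges[i],
    space_edges[i+1]) act as search spaces, each with a learnable score.
    The lowest-scoring active space is pruned per iteration until the
    fraction of pruned weights reaches target_rate."""
    mask = np.ones_like(weights, dtype=bool)
    active = list(range(len(scores)))
    total = weights.size
    while (total - mask.sum()) / total < target_rate and len(active) > 1:
        # select the active search space with the smallest learnable score
        victim = min(active, key=lambda i: scores[i])
        lo, hi = space_edges[victim], space_edges[victim + 1]
        mask &= ~((weights >= lo) & (weights < hi))  # prune that value range
        active.remove(victim)
    return mask
```

Because the scores are learned rather than hand-tuned, the choice of which value ranges to prune requires no expert experience, matching the automation argument made in the abstract.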
Keywords/Search Tags: deep learning, network acceleration, network pruning, knowledge distillation, network architecture search