In the field of deep learning model compression, pruning is a popular and efficient technique: it can greatly reduce the computation and parameter count of a network while preserving accuracy, thereby improving inference speed and reducing the memory footprint. In recent years, convolutional neural networks have grown deeper and wider in pursuit of higher accuracy. Although feature-extraction capability has improved, the resulting computation and parameter counts make deployment difficult, especially on embedded platforms with very limited hardware resources. To ease the tension between ever-larger networks and increasingly difficult deployment, the field of model compression has developed rapidly in recent years.

This paper takes structured pruning as its starting point. To achieve a higher compression rate without losing accuracy, it first improves the popular channel-level structured pruning algorithm so that more redundant channels can be removed. Then, to overcome the limitations of traditional channel pruning, the focus of pruning is shifted from whole channels to the interior of the convolution kernel: regular fine-grained structured pruning is performed alongside coarse-grained channel pruning, making the pruning algorithm more accurate and efficient. Finally, an accelerated vehicle and pedestrian detection system is successfully deployed on an embedded platform. The main research results are:

(1) Traditional channel pruning is incomplete in that it considers only the variance (scale) factor of the batch normalization (BN) layer. To address this, a channel pruning scheme based on the BN layer and the ReLU activation is studied, which jointly considers the expectation (shift) factor, the variance factor, and the effect of the ReLU activation. Using the Gaussian distribution of BN outputs, the scheme predicts which channels will have most of their activations set to 0 after passing through ReLU; pruning these redundant feature channels reduces model size without affecting the network's expressive power. Extensive experiments on two mainstream networks across different datasets verify the effectiveness and generality of the method. For example, on CIFAR-10 with VGG-16, accuracy improves by 0.04% over the baseline while computation is compressed by 63.4%; with ResNet-56, accuracy drops by only 0.14% while computation is compressed by 71.2%.

(2) Network structures impose rigid dimension-matching constraints on feature maps: for example, the number of kernels in the last convolutional layer on the main path of a residual block in ResNet, and the number of kernels in the last layer of each branch before the concatenation in an Inception network, cannot be changed. To address this, a multi-granularity neural network pruning method under a regularization mechanism is studied, with a coarse-to-fine pruning strategy that keeps the kernel count unchanged in dimension-matching layers while still introducing sparsity. The scheme further proposes an adaptive L1 regularization sparsification method, which lets the network account for structural changes while updating its parameters. The sparse kernels not only have fewer parameters and less computation than the original kernels but also better structural properties, giving the network stronger expressive capability. For example, on CIFAR-10 with VGG-16, accuracy improves by 0.19% over the baseline while computation is compressed by 76.73%; with ResNet-56, accuracy drops by only 0.14% while computation is compressed by 82.54%. On ImageNet with ResNet-50, accuracy drops by only 0.48% while computation is compressed by 56.95%, outperforming existing state-of-the-art pruning methods.

(3) A pruned vehicle and pedestrian detection system is built on the NVIDIA Jetson Nano embedded device. First, a YOLOv3 network is trained on a server with the PASCAL VOC 2007 dataset to obtain the baseline model. Then, with the same dataset and network, the multi-granularity structured pruning algorithm is added to training to obtain the pruned model. Finally, a Qt-based system is built on the embedded device; the networks before and after pruning are deployed and compared in both offline and real-world scenarios, verifying the effectiveness of on-device deployment of the proposed scheme.
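The BN-plus-ReLU channel criterion described in contribution (1) can be sketched as follows. This is a minimal illustration, not the thesis's exact formulation: the function names and the 0.99 threshold are assumptions. Treating each channel's pre-ReLU activation as roughly Gaussian with mean β (the BN shift factor) and standard deviation |γ| (the BN scale factor), the fraction of activations zeroed by ReLU is Φ(−β/|γ|); channels where this fraction is near 1 contribute almost nothing and can be pruned.

```python
import math

def relu_zero_prob(gamma, beta):
    """P(channel output <= 0 after BN), i.e. the fraction ReLU will zero out.

    Assumes the BN output is ~ N(beta, gamma^2), so
    P(x <= 0) = Phi(-beta / |gamma|) with Phi the standard normal CDF.
    """
    if gamma == 0.0:
        return 1.0 if beta <= 0.0 else 0.0
    return 0.5 * (1.0 + math.erf(-beta / (abs(gamma) * math.sqrt(2.0))))

def select_prunable_channels(gammas, betas, threshold=0.99):
    """Indices of channels that ReLU almost surely silences (threshold is illustrative)."""
    return [i for i, (g, b) in enumerate(zip(gammas, betas))
            if relu_zero_prob(g, b) >= threshold]
```

For instance, a channel with a tiny scale γ = 0.01 and a negative shift β = −0.5 is flagged, while a channel with β = 1.0, γ = 0.5 (mostly positive outputs) is kept. This captures why the scheme prunes more than variance-only criteria: a channel can have a moderate γ yet still be dead if β is sufficiently negative.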
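The adaptive L1 sparsification in contribution (2) might look roughly like the following soft-thresholding (proximal) step. This is a sketch under stated assumptions: scaling the penalty by each kernel's inverse mean magnitude is one plausible reading of "adaptive", and the function name and constants are hypothetical, not taken from the thesis. The key property it demonstrates is that the number of kernels (and hence the output dimension of dimension-matching layers) never changes; only weights inside each kernel are driven to exact zero.

```python
import numpy as np

def adaptive_l1_prox_step(weights, base_lambda=1e-3, lr=0.1):
    """One soft-thresholding step with a per-kernel adaptive L1 strength.

    weights: array of shape (num_kernels, kh, kw). The kernel count is
    preserved -- only intra-kernel weights are zeroed, matching regular
    fine-grained structured pruning.
    """
    pruned = weights.astype(float).copy()
    for k in range(pruned.shape[0]):
        kernel = pruned[k]
        # Hypothetical adaptive rule: kernels with smaller average magnitude
        # receive a stronger penalty, so weak kernels sparsify faster.
        lam = base_lambda / (np.abs(kernel).mean() + 1e-8)
        shrink = lr * lam
        kernel[...] = np.sign(kernel) * np.maximum(np.abs(kernel) - shrink, 0.0)
    return pruned
```

In an actual training loop this step would follow each gradient update, so the network "takes the structural change into account while updating the parameters", as the abstract puts it: weights that soft-thresholding zeroes stay in the tensor (keeping dimensions intact) but no longer contribute to the forward pass.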