With the development of convolutional neural network (CNN) technology in deep learning, CNN-based model architectures are now widely used. To improve performance, network models tend to grow more complex and larger, and neural networks themselves contain computational redundancy, so edge devices often cannot meet the computational requirements of complex models. Model compression for deep convolutional neural networks has therefore been widely studied: effective compression algorithms reduce redundancy and turn complex models into lightweight ones suited to a broader range of application scenarios. This paper studies lightweight convolutional neural networks and structured model pruning.

First, we manually design lightweight convolutional modules to construct the neural network RDPNet. The core idea is an RDP module built on depthwise separable convolution and structural reparameterization: a multi-branch structure is used for training and is reparameterized into a single-branch structure for inference. On this basis, the overall network is constructed with appropriate adjustments to depth and width. Experimental results show that RDPNet outperforms comparable lightweight neural networks, achieving a good balance between model performance and inference speed.

Second, we propose an improved Global Adaptive Pruning method, a structured dynamic pruning approach. Sparse training yields a model with a sparse solution; channel importance is judged from the training mask, and redundant channels are then pruned to compress the model. The advantage of this approach is that it explores the model's implicit architecture and judges channel importance dynamically during training, achieving better compression results.

Finally, the two proposed compression methods are applied in an edge-device environment. On top of them, INT8 quantization further compresses the model and accelerates computation. Running inference with the compressed models on edge devices verifies the effectiveness of the compression methods proposed in this paper.
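To make the reparameterization idea concrete, the following is a minimal sketch of a multi-branch depthwise block that is fused into a single branch for inference. The class name RDPBlockSketch, the specific branch choice (parallel 3x3 and 1x1 depthwise convolutions), and the fusion details are illustrative assumptions, not the paper's actual RDP module.

```python
import torch
import torch.nn as nn

class RDPBlockSketch(nn.Module):
    """Hypothetical sketch: parallel 3x3 and 1x1 depthwise branches
    during training, fused into one 3x3 depthwise conv for inference."""
    def __init__(self, channels):
        super().__init__()
        self.dw3 = nn.Conv2d(channels, channels, 3, padding=1,
                             groups=channels, bias=True)
        self.dw1 = nn.Conv2d(channels, channels, 1,
                             groups=channels, bias=True)
        self.fused = None  # set by reparameterize()

    def forward(self, x):
        if self.fused is not None:           # inference: single branch
            return self.fused(x)
        return self.dw3(x) + self.dw1(x)     # training: multi-branch

    @torch.no_grad()
    def reparameterize(self):
        # Convolution is linear, so summing two branches equals one
        # conv whose kernel is the 3x3 kernel plus the zero-padded
        # 1x1 kernel, with biases added.
        k = self.dw3.weight.clone()
        k[:, :, 1:2, 1:2] += self.dw1.weight
        b = self.dw3.bias + self.dw1.bias
        c = k.shape[0]
        self.fused = nn.Conv2d(c, c, 3, padding=1, groups=c, bias=True)
        self.fused.weight.copy_(k)
        self.fused.bias.copy_(b)
```

After training, calling `reparameterize()` collapses the block so that inference pays for only one convolution, which is the source of the speed benefit claimed for the training/inference structural split.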
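The structured pruning step can likewise be sketched in simplified form. The paper's Global Adaptive Pruning judges channel importance dynamically from a training mask; the stand-in below instead ranks channels statically by BatchNorm scale magnitudes learned under a sparsity penalty (network-slimming style), so the function name, the threshold rule, and the use of BN scales are all assumptions for illustration.

```python
import torch
import torch.nn as nn

def channel_masks_sketch(model, keep_ratio=0.5):
    """Hedged sketch: rank channels by |gamma| of BatchNorm layers
    trained with an L1 sparsity penalty; keep the top keep_ratio
    fraction globally and mark the rest as redundant."""
    gammas = torch.cat([m.weight.abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, 1.0 - keep_ratio)
    # Boolean keep-mask per BN layer; actually removing channels then
    # requires rewiring the adjacent conv layers accordingly.
    return {name: (m.weight.abs() >= threshold)
            for name, m in model.named_modules()
            if isinstance(m, nn.BatchNorm2d)}
```

Because the threshold is computed over all layers at once, the kept channels are allocated globally rather than per layer, which mirrors the "global" aspect of the method described above.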
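Finally, as a rough illustration of the INT8 deployment step, here is a sketch using PyTorch's eager-mode post-training static quantization. The calibration loader, the "fbgemm" backend choice, and the absence of layer fusion are assumptions; an actual edge deployment would adapt these to the target hardware and runtime.

```python
import torch
import torch.quantization as tq

def quantize_int8_sketch(model_fp32, calib_loader):
    """Hedged sketch: post-training static INT8 quantization.
    Observers collect activation ranges on calibration data,
    then weights and activations are converted to INT8."""
    model_fp32.eval()
    model_fp32.qconfig = tq.get_default_qconfig("fbgemm")
    prepared = tq.prepare(model_fp32)        # insert observers
    with torch.no_grad():
        for images, _ in calib_loader:       # calibrate ranges
            prepared(images)
    return tq.convert(prepared)              # INT8 model
```

Applied after pruning, this step shrinks weights from 32-bit floats to 8-bit integers (roughly a 4x size reduction) and enables integer arithmetic, which is what yields the additional compression and acceleration on edge devices.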