Popular object detection algorithms are mostly built on convolutional neural networks, whose heavy computation and large number of parameters make them difficult to deploy on platforms with limited computational resources, such as embedded devices. Although the computing power of embedded platforms has grown rapidly in recent years, and industrial demand for object detection keeps increasing, the inference speed of common detection models still falls short of practical requirements: computational cost and parameter count remain the main factors restricting real-world deployment. Studying model compression for object detection is therefore of great significance for its industrial application.

In this thesis, an object detection network suitable for embedded deployment is designed and optimized through an in-depth study of object detection algorithms. On top of the optimized network, a compression algorithm based on pruning and quantization is proposed, which greatly reduces the model size and improves inference speed while maintaining accuracy. Experiments on the Jetson Xavier NX embedded platform show that the proposed pruning-and-quantization compression algorithm is effective.

First, to address the fact that popular detection networks are ill-suited to embedded deployment because of their computational cost and parameter count, an object detection network tailored to embedded platforms is designed. Building on an in-depth analysis of different detection algorithms, a variety of optimization techniques are applied to dataset augmentation, weight optimization, network architecture design, and other aspects. Drawing on proven design practices of successful detection networks, the network is optimized, and its performance is verified by comparison with other detection algorithms on the COCO dataset.

Second, to address the large accuracy degradation and poor sparsity caused by traditional channel pruning, a progressive sparsity training method is proposed. The channels are divided into three parts, a pruned part, an undetermined part, and a reserved part, and a different sparsity strategy is applied to each part during sparsity training. Compared with conventional sparsity training, the proposed method affects model accuracy less and produces a better sparsity pattern.
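The abstract does not fix the partition rule or the per-group penalty strengths, so the following is only a minimal PyTorch sketch of one plausible realization. It assumes the three-way split is made on the magnitude of BatchNorm scale factors (as in network-slimming-style channel pruning), with hypothetical thresholds tau_low and tau_high and a hypothetical penalty weight lam:

    import torch
    import torch.nn as nn

    # Hypothetical thresholds and penalty weight; the thesis does not state them.
    def progressive_bn_sparsity_grad(model, lam=1e-4, tau_low=0.05, tau_high=0.5):
        """Add an L1 subgradient on BatchNorm scale factors (gamma), with a
        different strength per channel group:
          pruned part:        |gamma| <  tau_low            -> strong penalty
          undetermined part:  tau_low <= |gamma| < tau_high -> moderate penalty
          reserved part:      |gamma| >= tau_high           -> no penalty
        Call after loss.backward() and before optimizer.step()."""
        for m in model.modules():
            if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
                g = m.weight.data
                scale = torch.ones_like(g)          # undetermined part: moderate (1.0)
                scale[g.abs() < tau_low] = 2.0      # pruned part: push harder to zero
                scale[g.abs() >= tau_high] = 0.0    # reserved part: leave untouched
                m.weight.grad.add_(lam * scale * g.sign())

Leaving the clearly important channels unpenalized while pushing likely-prunable channels toward zero is one way to obtain the staged, gentler sparsification the method describes; the exact thresholds and strengths would be tuned per network.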
Third, to verify the effectiveness of the channel pruning compression algorithm, pruning experiments are carried out on the Pascal VOC and Wider Face datasets with the optimized detection network. By analyzing the accuracy, computational cost, parameter count, and inference latency of the pruned models under different pruning ratios, a pruned model that balances accuracy and inference speed is selected, verifying the effectiveness of the channel pruning algorithm.

Finally, since any single compression method achieves only limited compression, a joint pruning-quantization compression algorithm for object detection networks is proposed. It applies quantization to the pruned model, and the combination of pruning and quantization further improves the compression ratio. After analyzing the respective advantages and disadvantages of post-training quantization and quantization-aware training, the pruned model is quantized with quantization-aware training. To verify the effect of the joint algorithm, quantization experiments are conducted on the Pascal VOC and Wider Face datasets. The results show that, compared with any single compression method, the joint pruning-and-quantization algorithm achieves a higher compression ratio while maintaining better accuracy: on the VOC dataset it reaches a compression ratio of 15.24 with an inference latency of 9.12 ms and an accuracy loss of only 3.89%, and on the Wider Face dataset it reaches a compression ratio of 51.57 with an inference latency of 7.58 ms and an accuracy loss of only 2.42%.
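For the quantization stage, the abstract states only that quantization-aware training is applied to the pruned model; the sketch below shows a minimal eager-mode PyTorch version of that step under stated assumptions. The names pruned_model, train_loader, and criterion are hypothetical, and the model is assumed to already wrap its forward pass in QuantStub/DeQuantStub modules, as eager-mode quantization requires:

    import torch
    import torch.quantization as tq

    def quantize_pruned_model(pruned_model, train_loader, criterion, epochs=3):
        """Fine-tune the pruned model with fake quantization so it learns to
        tolerate INT8 rounding, then convert it to a real quantized model."""
        model = pruned_model.train()
        model.qconfig = tq.get_default_qat_qconfig("qnnpack")  # ARM backend
        tq.prepare_qat(model, inplace=True)   # insert fake-quant/observer modules

        optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
        for _ in range(epochs):               # short fine-tuning run
            for images, targets in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), targets)
                loss.backward()
                optimizer.step()

        model.eval()
        return tq.convert(model)              # materialize INT8 weights

Because the fake-quantization modules are trained jointly with the surviving weights, the pruned network can adapt to INT8 rounding during fine-tuning, which is consistent with the small accuracy losses reported above.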