
Research On Compression Technology Of Object Detection Model Based On Deep Learning

Posted on: 2021-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
Full Text: PDF
GTID: 2518306503972699
Subject: Electronics and Communications Engineering

Abstract/Summary:
Thanks to improvements in hardware computing capability, deep neural networks are increasingly widely used in computer vision tasks. To extract richer features and become more robust, networks have grown deeper and deeper, and their parameter counts have grown with them. Such networks are easy to train and deploy on desktop GPUs such as the NVIDIA 1080 Ti, but mobile devices such as smartphones and drones have limited computing capability, bandwidth, and storage, which makes deploying large deep learning models on them a serious challenge. Compressing a model so that it can be deployed effectively on resource-constrained hardware has therefore become an important research direction in academia. However, most model compression methods target only image classification models; their effect on tasks such as object detection needs further study. The focus of this thesis is efficient compression methods for object detection models.

This thesis first reviews trends in deep neural network design and common model compression methods. For the widely used object detection model YOLO, a compression method based on LASSO regression is proposed, which outperforms manual tuning (sketched below).

For the embedded platform NVIDIA Jetson TX2, this thesis proposes a lightweight object detection model based on depthwise separable convolution (sketched below) and feature-map fusion, and further compresses it with the proposed pruning method. The resulting model is only 6.8 MB and runs object detection on 1080p video at 18 FPS on the TX2, making it suitable for embedded object detection applications.

For table text detection on mobile phones, this thesis first analyzes the bottlenecks that slow down PSENet, then replaces conventional convolutions with dilated convolutions to enlarge the receptive field, and further uses depthwise separable convolution and group convolution to reduce parameters (sketched below). To improve the segmentation result, a Pyramid Pooling Module and a Feature Pyramid are introduced into the backbone network. Given the complexity of the task, the post-processing procedure is also optimized. Finally, the model is converted to MNN format and quantized to INT8 for efficient deployment on the iPhone, achieving fast text detection.
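The abstract does not give the exact pruning objective, so the following is a minimal sketch of the standard LASSO-based channel-selection formulation (in the spirit of He et al.'s channel pruning, named here as a stand-in): an L1 penalty on per-channel scaling coefficients drives the coefficients of unimportant channels to zero, and those channels are pruned. All shapes and function names are illustrative assumptions, not the thesis's code.

```python
# Hypothetical sketch of LASSO-based channel selection for one conv layer.
import numpy as np
from sklearn.linear_model import Lasso

def lasso_channel_mask(X, W, Y, alpha=1e-4):
    """Select which of a layer's c input channels to keep.

    X: (N, c, k) sampled input patches, split per input channel
       (k = kernel_h * kernel_w values per patch and channel)
    W: (n, c, k) the layer's weights, reshaped per input channel
    Y: (N, n) the layer's original output responses on those patches
    Returns a boolean keep-mask over the c channels.
    """
    N, c, k = X.shape
    n = W.shape[0]
    # Column i of the design matrix is channel i's contribution to the output.
    Z = np.stack([X[:, i, :] @ W[:, i, :].T for i in range(c)], axis=-1)  # (N, n, c)
    beta = (Lasso(alpha=alpha, fit_intercept=False, positive=True)
            .fit(Z.reshape(N * n, c), Y.reshape(N * n))
            .coef_)
    # Channels whose coefficient the L1 penalty drove to zero are pruned.
    return beta > 1e-8
```

Raising `alpha` prunes more channels; in this family of methods the retained weights are then refit by least squares to reconstruct `Y`, which is what makes the approach less ad hoc than manual tuning.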
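The TX2 model is built from depthwise separable convolutions. Below is a generic PyTorch rendering of such a block; the BatchNorm/ReLU placement is a common convention and an assumption here, since the abstract does not specify the block layout or the feature-map fusion details.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise conv followed by 1x1 pointwise conv.

    Parameter cost: k*k*c_in + c_in*c_out, versus k*k*c_in*c_out for a
    standard convolution (roughly an 8-9x saving for 3x3 kernels).
    """
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # groups=in_ch makes the 3x3 conv act on each channel independently.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # The 1x1 conv then mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))
```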
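For the PSENet speed-up, the abstract mentions swapping conventional convolutions for dilated ones to widen the receptive field and using group convolution to cut parameters. A minimal illustration follows; the channel counts are placeholders, not the thesis's actual layer sizes.

```python
import torch.nn as nn

# A 3x3 conv with dilation=2 covers a 5x5 receptive field at the parameter
# cost of a 3x3 conv; padding=dilation preserves the spatial resolution.
conv_standard = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)
conv_dilated = nn.Conv2d(256, 256, kernel_size=3, padding=2, dilation=2, bias=False)

# Grouped convolution splits channels into independent groups, reducing
# parameters by the group factor: k*k*(c_in/g)*c_out versus k*k*c_in*c_out
# (4x fewer here with groups=4).
conv_grouped = nn.Conv2d(256, 256, kernel_size=3, padding=1, groups=4, bias=False)
```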
Keywords/Search Tags:Deep Learning, Object Detection, Model Compression