Research And Application Of Object Detection Model Based On Improved YOLOv3

Posted on: 2023-03-23 | Degree: Master | Type: Thesis
Country: China | Candidate: M T Guo
GTID: 2568306836974069 | Subject: Software engineering
Abstract/Summary:
Deep-learning-based convolutional neural networks are among the most popular methods for target detection. As an important branch of computer vision, target detection is widely used in autonomous driving, industrial inspection, and other production activities, where it can significantly reduce labor and material costs and improve production efficiency. Object detection is also a prerequisite for higher-level computer vision tasks such as instance segmentation and panoptic segmentation. YOLOv3, one of the most popular target detection algorithms, has strong generalization ability. However, the algorithms currently used in autonomous driving suffer from problems such as poor accuracy and difficult deployment. This thesis therefore studies target detection from two aspects, improving detection accuracy and compressing model parameters, and makes improvements based on YOLOv3. The main work of this thesis is as follows:

(1) To improve detection accuracy, this thesis first adopts spatial pyramid pooling to fuse local and global features in the image, which enriches the representational ability of the feature map and allows targets of widely varying sizes to be detected more effectively. Second, an attention mechanism is added to the feature map to weight each channel, enhancing key features and suppressing redundant ones, so as to improve the network's ability to discriminate between target objects and background. Finally, the anchor boxes obtained by the k-means clustering algorithm are fitted using the GIoU loss function to obtain the final prediction boxes, completing the localization and recognition of target vehicles and pedestrians. Experimental results show that the proposed method achieves 91.4% mAP (mean average precision) and an 83.2% F1 score on the KITTI data set at 45.3 FPS (frames per second), outperforming the original YOLOv3 in both accuracy and speed.

(2) For model compression, this thesis first combines the G-Module with depthwise convolution to construct the backbone network of the model, and adds an attention mechanism to the backbone to weight each channel, enhancing key features and suppressing redundant ones, so as to improve the network's ability to discriminate between target objects and background. Second, the scaling factor gamma of the Batch Normalization layers is used to prune channels, which compresses the model and speeds up computation. Finally, model conversion and half-precision acceleration are carried out with the NVIDIA TensorRT framework, and the accelerated model is successfully deployed on the Jetson Nano embedded platform. Experimental results show that, on the KITTI data set, the inference speed of the proposed method is about 5 times that of the original model, and the parameter count is reduced by one tenth.

(3) A prototype vehicle and pedestrian detection system based on the improved YOLOv3 model is designed and implemented. The software and hardware requirements of the prototype system are analyzed, the improved YOLOv3 model proposed in this thesis is implemented and deployed, and the system's recognition performance is tested on the relevant data sets to verify the feasibility and effectiveness of the algorithm.
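
As an illustration of the GIoU loss named in item (1), the following is a minimal sketch of the standard generalized IoU computation for two axis-aligned boxes. It is not taken from the thesis: the function names (`area`, `giou`, `giou_loss`), the `(x1, y1, x2, y2)` box layout, and the use of `1 - GIoU` as the loss are assumptions made for this example, and the thesis may implement or weight the loss differently.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; an empty or inverted box counts as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def giou(pred: Box, target: Box) -> float:
    """Generalized IoU in the range (-1, 1]; equals 1 for identical boxes."""
    # Intersection rectangle (may be empty, in which case its area is 0).
    inter = area((
        max(pred[0], target[0]), max(pred[1], target[1]),
        min(pred[2], target[2]), min(pred[3], target[3]),
    ))
    union = area(pred) + area(target) - inter
    if union <= 0.0:
        raise ValueError("both boxes are degenerate (zero area)")
    # Smallest axis-aligned box that encloses both inputs.
    enclosing = area((
        min(pred[0], target[0]), min(pred[1], target[1]),
        max(pred[2], target[2]), max(pred[3], target[3]),
    ))
    iou = inter / union
    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclosing - union) / enclosing


def giou_loss(pred: Box, target: Box) -> float:
    """Loss form commonly used for box regression: 1 - GIoU."""
    return 1.0 - giou(pred, target)
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU keeps decreasing as the boxes move apart, so the loss still provides a useful signal when a predicted box misses its target entirely.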
Keywords/Search Tags: Object Detection, Spatial Pyramid Pooling, Attention Mechanism, Half-precision Acceleration, Model Pruning