
Research and Application of a Real-Time Object Detection Algorithm Based on Deep Learning

Posted on: 2022-10-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Zhang
GTID: 2518306605473444
Subject: Master of Engineering
Abstract/Summary:
The emergence of deep learning has driven rapid progress in computer vision. As one of the most basic tasks in computer vision, object detection is widely used in fields such as autonomous driving, intelligent monitoring, and intelligent medical care. In recent years, with the popularity of social media networks and mobile/embedded devices, the amount of video and image data has increased dramatically, which has led to a growing demand for visual data analysis. How to balance the speed and accuracy of algorithms, and in particular how to realize real-time object detection algorithms suitable for mobile devices with low-end GPUs, has become an active research topic. This thesis studies and improves the real-time object detection algorithm YOLOv5s, raises its accuracy while preserving real-time performance, and applies the improved algorithm to embedded devices to achieve real-time detection in different scenarios. The work is summarized as follows:

1. To address the insufficient accuracy of the baseline algorithm on small targets, this thesis proposes a feature fusion network based on adaptive receptive field enhancement (ARFA) and a multi-scale fusion attention mechanism (MFA). First, the ARFA module uses spatial pyramid convolution to fuse features from different receptive fields. To reduce the number of parameters, it uses dilated (atrous) convolutions with different dilation rates in place of ordinary convolutions. In addition, for each convolutional layer, features from different receptive fields contribute differently to the network, so a gate mechanism lets the network adaptively learn the weights of the receptive field branches from the input feature map and then output a weighted fusion of the branches. Second, the MFA models the correlation between the channels of features at different resolutions during fusion and, through network learning, automatically obtains the importance of each feature channel at each scale. Finally, ARFA and MFA are combined into an enhanced path aggregation network (EPANet). The ARFA module is added to the prediction layers at all three scales to extract more fine-grained features, and the top-down and bottom-up branches are then fused through MFA, so that features of different scales and different receptive fields are fused fully and effectively. This provides richer and more effective feature information to the prediction network and improves the detection of small targets.

2. To address the large number of hyperparameters in the algorithm that must be set manually, this thesis uses a hyperparameter search method based on a genetic algorithm and hyperparameter evolution to search for the required hyperparameters and obtain the best combination found. This improves the accuracy of the algorithm without adding inference time.

3. Based on the improved YOLOv5s algorithm, the model trained on the Pascal VOC data set is converted into a TensorRT model via ONNX and, after quantization for acceleration, deployed to a Jetson Nano embedded development board with a low-end GPU. The system reads video from a camera or takes video or image data directly as input, and switches between tasks such as natural scene detection, pedestrian detection, and vehicle detection by controlling the input parameters. The result is a mobile and flexible real-time multi-target detection system with practical engineering value.
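The gated fusion of receptive field branches described in point 1 can be illustrated with a small toy sketch. It is not the thesis's ARFA module: it works on a 1-D signal rather than 2-D feature maps, shares one kernel across branches, and assumes the gate is a softmax over a linear function of the globally averaged input. The abstract does not specify the gate design, and all function and parameter names here are hypothetical.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1-D dilated convolution with zero padding (odd-length kernel)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(signal, kernel, dilations, gate_weights, gate_biases):
    """Fuse dilated-convolution branches with input-dependent gate weights.

    Assumption: the gate is softmax(w * mean(signal) + b), one logit per branch.
    In a real network the gate parameters would be learned.
    """
    pooled = sum(signal) / len(signal)  # global average pooling of the input
    gates = softmax([w * pooled + b for w, b in zip(gate_weights, gate_biases)])
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [
        sum(g * branch[i] for g, branch in zip(gates, branches))
        for i in range(len(signal))
    ]
    return fused, gates


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
fused, gates = gated_fusion(
    signal,
    kernel=[1 / 3, 1 / 3, 1 / 3],
    dilations=[1, 2, 3],          # three receptive field sizes
    gate_weights=[0.5, 0.0, -0.5],
    gate_biases=[0.0, 0.0, 0.0],
)
print(gates, fused)
```

Each branch uses the same three-tap kernel, so a larger dilation rate widens the receptive field without adding parameters, which is the motivation the abstract gives for replacing ordinary convolutions.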
Keywords/Search Tags:deep learning, multi-target detection, real-time, embedded device
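
The hyperparameter search in point 2 of the abstract can be sketched as a mutate-and-select loop. This is a simplified illustration, not the thesis's procedure: it uses mutation only (no crossover), picks a parent at random from the best results so far, and scores candidates with a toy fitness function. In practice the fitness would be a validation metric from a training run. The hyperparameter names, bounds, and optimum below are hypothetical.

```python
import random


def mutate(params, bounds, rng, sigma=0.2, prob=0.8):
    """Scale each hyperparameter by Gaussian noise with probability `prob`, then clip to bounds."""
    child = {}
    for name, value in params.items():
        low, high = bounds[name]
        if rng.random() < prob:
            value *= 1.0 + rng.gauss(0.0, sigma)
        child[name] = min(max(value, low), high)
    return child


def evolve(fitness, initial, bounds, generations=50, top_k=5, seed=0):
    """Return (best_score, best_params) after mutating parents drawn from the top_k results."""
    rng = random.Random(seed)
    history = [(fitness(initial), initial)]
    for _ in range(generations):
        history.sort(key=lambda item: item[0], reverse=True)
        _, parent = rng.choice(history[:top_k])  # select among the best so far
        child = mutate(parent, bounds, rng)
        history.append((fitness(child), child))
    return max(history, key=lambda item: item[0])


def toy_fitness(params):
    """Stand-in for a validation metric; peaks at a made-up optimum (higher is better)."""
    return -((params["lr0"] - 0.01) ** 2 * 1e4 + (params["momentum"] - 0.9) ** 2)


bounds = {"lr0": (1e-5, 1e-1), "momentum": (0.6, 0.98)}
start = {"lr0": 0.05, "momentum": 0.7}
best_score, best_params = evolve(toy_fitness, start, bounds, generations=100, seed=1)
print(best_score, best_params)
```

Because the search only changes training-time settings, the selected combination leaves the network's inference cost unchanged, which matches the abstract's claim of no extra inference time.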