
Research On Small Object Detection Method Based On Deep Learning

Posted on: 2022-12-28    Degree: Master    Type: Thesis
Country: China    Candidate: J Y Zhang    Full Text: PDF
GTID: 2518306608990559    Subject: Automation Technology
Abstract/Summary:
Object detection is one of the research directions in the field of computer vision. Its purpose is to determine whether a specific object is present in an image and to return its position in the image. It is widely used in autonomous driving, medical diagnosis, satellite imagery, and other applications. Object detection based on deep learning has developed rapidly in various fields and has become a research hotspot in recent years. However, many scenes contain objects that are small in size and occupy a small proportion of the image. Because such objects carry little information and appear against complex backgrounds, their deep feature information is easily lost and difficult to extract, and detection performance is hard to improve. Detection methods for small targets have therefore emerged, and they are of great significance to fields where a large number of small targets exist, such as traffic sign recognition, UAV monitoring, and real-time face detection.

Bochkovskiy et al. proposed YOLOv4 ("YOLOv4: Optimal Speed and Accuracy of Object Detection") in 2020, which is favored for its accuracy and efficiency. It is one of the best-performing target detection models of recent years and has also achieved good results in small target detection. However, the model still has problems that make it difficult to further improve its performance on small targets: (1) A fixed receptive field is used in the network layers, so the convolution focuses only on objects of regular size and ignores the characteristics of small targets, resulting in low detection accuracy. (2) The model structure is complex, with a large number of network layers and a large amount of computation; although it can extract deep features, it does not perform well on platforms that require real-time deployment. In response to these problems, this thesis proposes a target detection model that integrates multi-view attention and a lightweight target detection model. The main contributions are summarized as follows:

(1) A target detection model integrating multi-view attention is proposed. Detection accuracy for small targets is improved by introducing parallel receptive fields and attention mechanisms into the backbone network of YOLOv4. The model brings the idea of multiple receptive fields into YOLOv4 and embeds a selective-kernel attention mechanism in the residual structure of the CSP module of the backbone network, replacing the previous 3×3 convolutional layer and forming a multi-channel visual attention network, MFACSPDark. The improved model enables the network to obtain and integrate the context information of the feature map through different receptive fields. It then uses channel and spatial attention mechanisms to assign the integrated information to the corresponding convolution kernels with different weights, so as to adjust the importance of the different receptive fields in extracting features and to increase the attention paid to target regions of relatively small size and scale. The experimental results show that the YOLOv4 model integrated with MFACSPDark achieves better detection accuracy on small targets and can effectively avoid false detections and missed detections. On the small-target dataset MS COCO, AP_S and AP_M are 2.5% and 3.7% higher than the original YOLOv4, respectively. On the aerial dataset VEDAI, AP50 and AP75 improve by 2.4%, and on the PASCAL VOC dataset, AP50 and AP75 improve by 0.8%.

(2) The backbone network of the YOLOv4 model is improved, and a lightweight backbone network, Tiny CSPDark, is proposed. The network uses 3 lightweight cross-stage modules, Tiny CSP, instead of the original CSP modules. Compared with the CSP module, the Tiny CSP module deletes the 3×3 convolutional layer responsible for initial feature extraction, replaces the residual structure in the CSP module with a 3×3 convolutional layer and a 1×1 convolutional layer, and introduces a 1×1 convolutional layer and a max pooling layer in two different paths. The 1×1 convolutional layer replaces the 3×3 convolutional layer of the CSP module, and the max pooling layer reduces the size of the feature map, which further reduces the amount of computation and the number of parameters and improves the forward-propagation speed of the network.

(3) The feature fusion network PANet of the YOLOv4 model is improved, and the feature attention pyramid network CBAMFPN is proposed. The network uses FPN as the feature extraction module. Compared with PANet, it retains the initial fusion part, ensuring the fusion of deep semantic information and shallow representation information. The deep fusion part is removed, which reduces the subsequent downsampling and stacking operations, reduces the number of convolutional layers, and improves the model's response speed. After each feature fusion layer of FPN, a CBAM attention module is added to enhance the information representation through the attention mechanism, and channel information and spatial information are integrated into the feature map to extract important features. In this way the CBAMFPN network improves detection performance while simplifying the structure, achieving a balance of precision and speed.

(4) On the basis of work (2) and (3), the lightweight backbone network Tiny CSPDark and the feature attention pyramid network CBAMFPN are combined, and a lightweight target detection model, LWYOLOv4, is proposed. Compared with YOLOv4, the complex SPP module, which contains multiple receptive fields, is pruned, making the model more streamlined. The experimental results show that, compared with the baseline model YOLOv4, LWYOLOv4 has lower complexity and faster detection speed. At the expense of a small amount of accuracy, the number of parameters is reduced by 23%, the amount of computation is reduced by 22%, and the speed (FPS) is greatly improved, making the model more suitable for deployment on platforms with high real-time requirements.
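
The savings that contribution (2) attributes to swapping a 3×3 convolution for a 1×1 convolution and to max pooling can be checked with simple arithmetic. The sketch below is only an illustration: the helper names `conv_params` and `conv_macs` and the layer sizes (128 channels, a 52×52 feature map) are hypothetical and are not taken from the thesis, and bias terms are assumed to be off by default. It does not reproduce the whole-model 23% and 22% figures, which depend on the full architecture.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Learnable weights in a k x k convolution (bias off by default: an assumption)."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def conv_macs(c_in, c_out, k, h_out, w_out):
    """Multiply-accumulate operations for a k x k convolution with an h_out x w_out output."""
    return k * k * c_in * c_out * h_out * w_out


# Hypothetical layer sizes, chosen only for illustration.
C, H, W = 128, 52, 52

p3 = conv_params(C, C, 3)        # parameters of a 3x3 convolution
p1 = conv_params(C, C, 1)        # parameters of a 1x1 convolution
m3 = conv_macs(C, C, 3, H, W)    # computation of the 3x3 convolution
m1 = conv_macs(C, C, 1, H, W)    # computation of the 1x1 convolution

# A 2x2 max pooling with stride 2 halves height and width, so a
# following layer processes a quarter of the spatial positions.
m1_pooled = conv_macs(C, C, 1, H // 2, W // 2)

print("parameter ratio 3x3 / 1x1:", p3 // p1)
print("MAC ratio 3x3 / 1x1:", m3 // m1)
print("MAC ratio before / after pooling:", m1 // m1_pooled)
```

With equal channel counts, the 1×1 layer needs one ninth of the weights and computation of the 3×3 layer, and halving the feature map in each dimension cuts the cost of every later layer by a factor of four. Both effects act on individual layers, which is why the reduction for the whole model is much smaller than these per-layer ratios.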
Keywords/Search Tags: Small Object Detection, YOLOv4, Deep Learning, Attention Mechanism, Light Weight Model