| Object detection is an important area of computer vision. Depending on whether prior anchor boxes are used, object detection algorithms can be divided into anchor-based and anchor-free methods. Anchor-based detectors often generalize poorly and lack robustness because their performance depends on how the anchor boxes are configured, whereas anchor-free detectors do not require anchor-box hyperparameters and are therefore more robust. The attention mechanism has been a focus of research in recent years, and it can effectively model global relationships among features. This thesis takes an anchor-free object detection algorithm that incorporates the attention mechanism as its starting point and improves it to address two problems: low detection accuracy and slow network convergence. The specific research contents are as follows: 1. To address low detection accuracy, this thesis proposes two improvement schemes based on multi-scale image features. The first is multi-scale feature detection, which performs detection on several features of different scales and then aggregates the results into the final detection output, so that both deep and shallow feature information is used. The second is based on deformable convolution and dilated convolution: the down-sampling stage of the backbone is removed and these special convolutions are used to enlarge the receptive field, which provides the detection network with larger-scale features that carry richer information about small targets. 2. To address the long training time and slow convergence of the original network, this thesis analyzes the attention mechanism in the detection network through visualization and, by a process of elimination, identifies the cross-attention module as the cause of the slow convergence. The thesis therefore improves the cross-attention module of the original network by providing certain information to the input query, introducing reference points, changing how the predictions are generated, and replacing addition operations with concatenation operations to reduce confusion within the network. These changes greatly improve the convergence speed of the network and also improve the detection accuracy. The experimental results demonstrate the effectiveness of the above improvements. On the COCO2017 dataset, the proposed method achieves a 4.0% AP improvement overall and a 6.3% AP improvement on small targets, and the number of training cycles required for the network to converge is reduced by 80%... |