| The demand for intelligent systems is growing rapidly, and object detection has received widespread attention as a fundamental computer vision task. Accurately classifying and localizing objects of interest in images provides important technical support for scenarios such as autonomous driving and unmanned retail, and object detection also provides an important research foundation for video understanding. As a multi-task learning process, object detection usually requires better features than the classification task. Convolutional neural networks offer a major advantage here and have contributed to the rapid development of computer vision, including object detection. On this basis, the work of this paper is as follows: 1. Drawing on the research background, significance, and development status of object detection at home and abroad, and after a brief overview of traditional classic object detection algorithms, this paper classifies and summarizes deep-learning-based object detection algorithms. It introduces in detail the development history and relevant theory of convolutional neural networks and object detection, discusses the two types of deep learning object detection algorithms, one-stage and two-stage, in depth, and compares and analyzes their applications and development trends. 2. Detectors that use multi-scale features to predict objects of different sizes and aspect ratios have greatly outperformed detectors based on single-scale features. In addition, the feature pyramid structure is used to construct high-level semantic feature maps at all scales, further improving detector performance. However, such feature maps do not fully consider the supplementary effect of context information on semantics. Based on the SSD baseline network, this paper proposes a method for fusing neighboring feature layers and carefully designs the structure of the fusion module to make full use of context information. 3. To make the fused features better targeted to different scales, this paper draws on the attention mechanism, which plays an important role in computer vision: it suppresses background noise while highlighting the key information in the feature map. A top-down residual attention mechanism is used to guide the feature fusion process. Unlike a channel attention mechanism, the spatial attention mechanism adopted in this paper is more conducive to highlighting the spatial contours and positions of objects in the object detection task. The feature fusion method with the residual attention mechanism enhances the representational ability of the prediction feature maps and improves detection accuracy at different scales. |
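
To make the idea of top-down residual spatial attention more concrete, the following is a minimal sketch in plain Python, not the fusion module proposed in this paper. It assumes that the spatial mask is obtained by averaging the channels of the deeper feature map and applying a sigmoid, that the mask is upsampled by nearest-neighbour interpolation, and that the residual form is `(1 + M) * F`, a common formulation in the attention literature. All function names (`upsample_nearest`, `spatial_attention_mask`, `residual_attention_fuse`) are hypothetical.

```python
import math


def upsample_nearest(mask, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    out = []
    for row in mask:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def spatial_attention_mask(deep):
    """Collapse a C x H x W map into one H x W mask with values in (0, 1).

    Assumption: channel mean followed by a sigmoid. A real module would
    typically learn this mapping with convolution layers.
    """
    channels = len(deep)
    height, width = len(deep[0]), len(deep[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            mean = sum(deep[c][y][x] for c in range(channels)) / channels
            row.append(1.0 / (1.0 + math.exp(-mean)))
        mask.append(row)
    return mask


def residual_attention_fuse(shallow, deep, factor=2):
    """Reweight the shallow map with a mask derived from the deeper map.

    The same spatial mask M is shared by every channel, which is what
    distinguishes spatial attention from channel attention. The residual
    form (1 + M) * F keeps the original feature F and only adds emphasis.
    """
    mask = upsample_nearest(spatial_attention_mask(deep), factor)
    return [
        [
            [(1.0 + mask[y][x]) * channel[y][x] for x in range(len(channel[0]))]
            for y in range(len(channel))
        ]
        for channel in shallow
    ]
```

Because the mask lies in (0, 1), each shallow feature is scaled by a factor between 1 and 2: positions the deeper layer responds to strongly are emphasized, while the rest pass through almost unchanged rather than being zeroed out.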