| Object detection is one of the fundamental problems in computer vision. Because traditional methods suffer from slow detection speed, limited generalization ability, and complex hand-crafted feature design, object detection was not widely applied in real-world scenes for a long time. Recently, deep learning has made breakthroughs in many research areas, such as computer vision, natural language processing, and speech recognition, owing to its generalization ability. Today, almost every industry is exploring how to apply deep learning in its own field. In object detection for drone-captured images, two main factors hinder progress. First, the objects are very small. Deep-learning-based detectors usually use feature extraction networks with large downsampling factors to obtain larger receptive fields and higher-level semantic features. However, because small objects occupy only a few pixels, their feature information is often diluted or even lost in deeper feature maps, which results in poor performance. Second, the scenes are complex: in drone-captured images, objects are surrounded by considerable interference, such as buildings, trees, and similar-looking objects, which reduces detection accuracy. These two problems greatly degrade detection accuracy on drone-captured images. Therefore, compared with conventional images, object detection in aerial images is more challenging. To address these problems, this paper makes the following contributions: (1) To address the loss of feature information caused by the small size of the objects, this paper exploits the fact that shallow feature maps retain rich features for small objects. Based on feature fusion, we present the Multi-branch Parallel Feature Pyramid Network (MPFPN) to enhance the feature
extraction ability for small objects and improve detection accuracy. (2) To address the interference that complex scenes introduce when detecting objects in drone-captured images, this paper adds an attention mechanism to the proposed MPFPN to weight the feature maps at the spatial and channel levels, which helps with complex image backgrounds. Specifically, we combine a channel attention module (CAM) and a spatial attention module (SAM) after the C5 feature map, which effectively improves detection in complex backgrounds. (3) In the training stage, a data augmentation method called multi-scale training is used. We set a group of different short-side lengths for the image in advance and keep the image aspect ratio, which notably improves generalization to objects of different scales. In the testing stage, weighted boxes fusion (WBF) is introduced into multi-scale testing to fuse the detection results obtained at different image scales. In addition, we use the same method to fuse the detection results of different backbone networks, exploiting their complementary strengths to further improve the accuracy of the proposed algorithm. |
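
The multi-scale resizing step described in (3) can be illustrated with a minimal Python sketch. It only shows how a target image size might be derived from a preset group of short-side lengths while keeping the aspect ratio. The values in `SHORT_SIDES` and the function names `scaled_size` and `sample_training_size` are assumptions made for illustration; the abstract does not state the lengths actually used or how they are sampled.

```python
import random

# Hypothetical preset group of short-side lengths (pixels).
# The abstract does not list the actual values used.
SHORT_SIDES = (480, 640, 800, 960)


def scaled_size(width, height, short_side):
    """Return (new_width, new_height) such that the shorter image side
    equals `short_side` and the aspect ratio is kept (up to rounding)."""
    scale = short_side / min(width, height)
    return round(width * scale), round(height * scale)


def sample_training_size(width, height, rng=random):
    """Pick one preset short side at random for a training iteration.
    Uniform random sampling is an assumption, not stated in the abstract."""
    return scaled_size(width, height, rng.choice(SHORT_SIDES))


if __name__ == "__main__":
    # A 1920x1080 frame resized so that its short side is 800 pixels.
    print(scaled_size(1920, 1080, 800))
    # One randomly sampled training size for the same frame.
    print(sample_training_size(1920, 1080))
```

At test time, the same `scaled_size` helper could be applied once per preset length. The boxes predicted at each scale would then need to be mapped back to the original image coordinates before they are fused, for example by WBF.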