| UAVs have gradually replaced humans in dangerous and difficult aerial missions thanks to their adaptability, survivability, low cost, and high efficiency. Strong visual capability is a prerequisite for a UAV to analyze scene information, make timely adjustments, and complete its tasks. With the continuous development of computer vision, object detection has become one of the core technologies for UAV applications. However, UAV images are mostly top-down views, and the targets they contain tend to be small, blurred, irregularly distributed, and easily occluded. Small target detection in this setting therefore faces three challenges: 1) A small target detection method may combine modules with different structures for comprehensive detection, which raises the question of how to fuse the results of the different modules and reach a sound decision; 2) The resolution of UAV image datasets is low and the small target information they contain is not salient, so small targets are easily missed during detection; 3) Samples are imbalanced across the different categories of small targets, which leads to imbalanced model training. This paper studies small target detection in UAV images and proposes a series of new models and methods that improve small object detection performance. The main contents are as follows: 1) Small object detection based on prediction information fusion of multiple decision-making layers. A multi-decision-layer prediction information fusion method is proposed. A fast dimensionality-reduction classification module is added to assist the single-stage detection branch and shares its decision-level prediction information with that branch, so that the feature-level multi-instance detection module produces reliable results. First, divide out the detections that have not reached the 
confidence threshold. For these detections, the feature vector provided by the denoising sparse autoencoder is computed and classification is performed again. Second, experience shows that a single detection result is not reliable enough, so multiple sources of decision information about the target need to be considered. On this basis, the predictions of the single-stage detector and of the classifier are combined and ranked by importance. Finally, the top-ranked instances are assigned a corresponding confidence, which gives them a chance to be recognized as objects again. The experimental results show that the multi-decision-layer prediction information fusion method improves on the single-decision-level baseline model: its average mAP improves by 2.7% on the VisDrone dataset, 1.0% on the UAVDT dataset, and 1.8% on the MS COCO dataset. 2) Small target re-check method based on a dual neural network. An efficient re-check method is proposed that quickly screens for the targets missed by single-stage detection and achieves high-quality detection of small targets through a second feature classification of the suspected target areas. First, a single-stage detector processes the UAV image, and results with a confidence above the threshold are accepted as targets. Results below the threshold are treated as suspected areas that may contain missed targets. Second, VGG extracts the feature map of the UAV image, which is combined with the location information of the suspected areas to be re-identified. Then the two features are fused at a late stage, and the re-identification results guide the weighting of the initial confidence. After weighting, areas with a confidence above the threshold are accepted as targets. Finally, the initial and secondary detections are merged to obtain the final result. Experiments on the VisDrone dataset show that its mAP exceeds that of the single-stage detection model by 
4.03%. 3) Small object detection based on a category attention mechanism. A multi-cue adjustment method is proposed that dynamically defines samples, screens out ambiguous samples, and focuses training on high-quality samples. The method gathers statistics on the samples and further explores the relationship between global and local samples. Based on these statistics, positive and negative samples are defined dynamically, ambiguous samples are screened by examining the relationship between each positive sample and the surrounding correct samples, and training focuses on the high-quality samples that contribute most to detection performance. First, a limited number of positive and negative training samples are selected dynamically in each image by jointly considering area, center point, length and width, and other characteristics. Second, because the detector easily confuses occluded targets, the overlap of positive samples between different targets is considered, and ambiguous targets are selected as hard negative samples. Finally, samples are neither independent of one another nor equal in their contribution to a category, so the samples with the highest group rankings are weighted according to target category and attributes. Experimental results show that this method achieves an mAP of 50.9% on the MS COCO dataset, 48.6% on the VisDrone2020 dataset, and 85.0% on the Pascal VOC 2007/2012 dataset. |
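
The re-check procedure of item 2) can be illustrated with a minimal sketch. It only mirrors the control flow described in the abstract: split detections at a confidence threshold, re-score the suspected areas with a second classifier, weight the initial confidence, and merge. The names `Detection`, `recheck`, `rescore`, and `alpha`, and the linear weighting rule itself, are assumptions made for illustration. The abstract does not give the actual weighting formula, and the VGG-based second classifier is replaced here by an arbitrary callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass
class Detection:
    box: Box
    label: str
    score: float  # confidence from the single-stage detector


def recheck(
    detections: List[Detection],
    rescore: Callable[[Detection], float],
    threshold: float = 0.5,
    alpha: float = 0.5,
) -> List[Detection]:
    """Keep confident detections and try to recover the low-confidence ones.

    `rescore` stands in for the second network: it returns a confidence in
    [0, 1] for a suspected area. `alpha` is an assumed mixing weight.
    """
    # Step 1: results above the threshold are accepted as targets.
    accepted = [d for d in detections if d.score > threshold]
    # The rest are suspected areas that may contain missed targets.
    suspected = [d for d in detections if d.score <= threshold]

    recovered = []
    for d in suspected:
        # Steps 2-3: the re-identification result guides the weighting of
        # the initial confidence (the linear blend is an assumption).
        weighted = (1 - alpha) * d.score + alpha * rescore(d)
        if weighted > threshold:
            recovered.append(Detection(d.box, d.label, weighted))

    # Step 4: merge the initial and secondary detections.
    return accepted + recovered
```

In this sketch a suspected area is promoted only when the second classifier is confident enough to lift the blended score over the same threshold. This keeps the single-stage detector's accepted results untouched and limits the second stage to the regions the first stage was unsure about.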