| With the development of remote sensing technology, object detection in remote sensing images has been increasingly widely used in marine transportation, urban construction, military monitoring, and other fields, and it is a research frontier and hot spot in object detection. In recent years, owing to their strong feature extraction and representation capabilities, deep learning models have dominated object detection for natural images and achieved impressive results. However, targets in remote sensing images differ greatly from those in natural images: aerial targets vary widely in size, are densely distributed, are often small, and appear in multiple orientations. Object detection algorithms designed for natural images are therefore not suitable for detecting aerial targets directly. The main research content of this thesis is to improve the detection performance for aerial targets using deep learning and to display the detection results at high speed. The main work and contributions of this thesis are as follows: (1) Collected and manually annotated more than 10,000 remote sensing images for object detection. Deep-learning-based object detection algorithms require a large amount of annotated data for training, but in the field of remote sensing the available annotated data is insufficient to support such training. One of the main tasks of this thesis is therefore to collect a large number of remote sensing images and manually annotate the aerial targets in them. The resulting remote sensing image dataset contains 7 categories, 11,078 images, and 54,312 targets. (2) Identified a backbone network suitable for aerial object detection. With the continuous development and improvement of CNN (Convolutional Neural Network) architectures, the feature representation capabilities of deep learning algorithms continue to increase. Based on the characteristics of aerial targets, and through theoretical analysis and experiments, we select ResNet34, which can fuse context features within the backbone and extracts features more efficiently, as the backbone network for aerial object detection algorithms. (3) Proposed a more efficient method of contextual semantic feature fusion. Remote sensing images contain many small targets, whose detection requires more effective features. This thesis improves the SSD (Single Shot MultiBox Detector) algorithm by combining it with FPN (Feature Pyramid Networks), using deconvolution to enlarge the feature maps and adding convolutional layers between different scales to improve the representational ability of the feature maps, thereby improving detection capability for small aerial targets. In addition, feature maps of different scales are densely connected, which further improves feature representation and target localization. (4) Used Focal Loss to balance positive and negative samples during training, and used the k-means algorithm to generate prior box sizes suitable for aerial target detection. SSD addresses the large imbalance between the numbers of positive and negative training samples but ignores the training loss of easy-to-classify samples. Focal Loss allows all samples to participate in training and improves the learning ability of the algorithm. In addition, to reduce the number of prior boxes used, the k-means algorithm clusters all training samples to generate prior boxes that are better suited to detecting aerial targets. (5) Designed and implemented a high-speed display system for remote sensing images. We use OpenGL and multi-threading to make full use of the GPU's rendering capability and the CPU's multi-core processing capability to achieve high-speed display of remote sensing images. |
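
The role of Focal Loss in contribution (4) can be illustrated with a minimal sketch of the standard binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The abstract does not state the settings used in the thesis, so the values `alpha=0.25` and `gamma=2.0` below are only the commonly cited defaults, and the function names are hypothetical. The point of the sketch is that easy samples still contribute to the loss, but with a much smaller weight than hard ones.

```python
import math

def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for one sample (p = predicted P(positive))."""
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    return -math.log(min(max(p_t, eps), 1.0))

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one sample.

    alpha and gamma are illustrative defaults, not values taken from the thesis.
    """
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)             # guard against log(0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy_negative = focal_loss(0.01, 0)   # background predicted confidently as background
hard_negative = focal_loss(0.90, 0)   # background mistaken for a target
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check when experimenting with the parameters.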
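
The k-means step in contribution (4) can be sketched as clustering the (width, height) pairs of annotated boxes. The abstract does not say which distance measure the thesis uses; the sketch below assumes the widely used `1 - IoU` distance between box shapes, and the function names and sample box sizes are hypothetical.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_prior_boxes(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k prior-box shapes.

    Assumes 1 - IoU as the distance, so the nearest centroid is the one
    with the highest IoU. Returns centroids sorted by area.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)          # initialise from k distinct samples
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep centroid of an empty cluster
        if new_centroids == centroids:        # assignments have stabilised
            break
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])

# Hypothetical ground-truth box sizes in pixels (not from the thesis dataset).
boxes = [(10, 12), (11, 10), (9, 11), (40, 20), (42, 22), (38, 19), (100, 95), (105, 98)]
priors = kmeans_prior_boxes(boxes, k=3)
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which matters when target sizes vary as widely as they do in aerial imagery.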