| Target detection based on aerial optical imaging has great research value in public service and military defense. It can be widely applied to urban construction, disaster monitoring, and precision strikes by tactical weapons, and in recent years there has been extensive research and application work related to aerial image target detection. Traditional detection methods fall mainly into three types: sliding window search, hand-crafted feature selection (SIFT, HOG, etc.), and shallow learned features. Their shortcomings can be summarized as follows: (1) Feature extraction is difficult. Traditional methods usually rely on manually designed features, which are hard to design and make the extraction process complicated and cumbersome. (2) The computation is complex and redundant. The sliding window method requires a global search over the entire image, which is inefficient and generates too many redundant windows. (3) The detection accuracy is low. For example, the support vector machine classifier widely used in traditional target detection cannot cope with large-scale training: its classification performance degrades as the number of samples grows, multi-class detection requires classifiers to be constructed repeatedly, and classification accuracy is extremely sensitive to the choice of parameters and kernel functions. In recent years, with the rapid development of computer vision and deep learning, target detection algorithms based on deep learning have achieved significant performance improvements. However, many problems remain if such algorithms are applied, with only direct modification, to target detection in aerial optical images. These problems are mainly reflected in the following aspects: (1) Aerial images taken at different angles differ significantly in their imaging characteristics. According to the shooting angle, aerial optical images can be divided into oblique images and vertical images. Because of the different shooting attitudes, some objects in the image are misaligned or occluded, so missed detections and false detections occur frequently. (2) In aerial optical images with a large field of view, the background is often complex and the targets exhibit multi-scale features. Some targets occupy very few pixels, making it difficult to locate and identify them quickly and accurately. (3) In practical earth observation applications, target detection on optical images with a large field of view places very high demands on processing speed and accuracy. When improving the algorithm design, it is therefore necessary to meet the accuracy requirements while avoiding, as far as possible, any increase in algorithm complexity. To solve the above problems, this paper studies target detection in aerial optical images based on Mask R-CNN. Building on the original algorithm, and aiming to improve the detection performance of the model, solve the problems that arise in practical applications, and improve the visual quality of the detection results, the paper carries out in-depth research on target detection in aerial imagery. The main tasks are as follows: (1) The development of target detection algorithms is analyzed and summarized, from traditional algorithms to algorithms based on deep learning. By analyzing the advantages and disadvantages of each method and studying the latest results of many researchers, the paper verifies that deep learning detection algorithms based on region proposal networks achieve higher performance, and discusses the reasons for selecting the Mask R-CNN model. (2) The principle and structure of the Mask R-CNN algorithm and the model-building process are studied, and each part is compared with the corresponding structure of Faster R-CNN. First, the main structure of Mask R-CNN is introduced and analyzed; then, the process of building the dataset used for Mask R-CNN training is described; finally, the advantages and problems of applying Mask R-CNN to aerial image target detection are discussed. (3) To solve the practical problems of Mask R-CNN in detecting aerial image targets, the paper proposes two improved algorithms, based on the original algorithm, for aerial images with different shooting angles. For the imaging characteristics of oblique aerial optical images, several improvement strategies are proposed. First, a new fusion branch is added to the original feature pyramid structure so that the feature maps express richer scale and depth information. At the same time, an online hard example mining mechanism is introduced to improve the accuracy of the network and to overcome the accuracy bottleneck caused by the imbalance between positive and negative training samples. Finally, a multi-component combination detection method is used to eliminate false detections caused by visual errors arising from the oblique shooting attitude. For the imaging characteristics of vertical aerial optical images, several improvement strategies are also proposed. First, a multi-spectral fusion network is added to merge visible and infrared images, addressing the missed and false detections caused by occluded targets; then, the feature pyramid structure is further improved; finally, a scale-dependent feature proposal network is proposed to further improve detection accuracy for targets of various sizes. To verify the effectiveness of the improved methods proposed for the two types of aerial image targets, experiments were conducted on different training sets. |
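
The abstract does not say how the online mining step is implemented. As a rough illustration only, one common formulation of online hard example mining keeps the highest-loss region samples in each batch and averages the loss over those alone, so that the many easy negatives do not dominate training. The sketch below assumes that formulation; the function names `select_hard_examples` and `ohem_loss` and the `keep` parameter are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-loss samples, hardest first.

    `losses` holds one loss value per candidate region (an assumption of
    this sketch; the paper's actual selection rule is not given).
    """
    if keep <= 0:
        return []
    # Sort sample indices by descending loss; ties keep their original order.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:keep]


def ohem_loss(losses: Sequence[float], keep: int) -> float:
    """Average the loss over the selected hard examples only."""
    hard = select_hard_examples(losses, keep)
    if not hard:
        return 0.0
    return sum(losses[i] for i in hard) / len(hard)


if __name__ == "__main__":
    # Five candidate regions; most are easy (low loss), two are hard.
    per_region_losses = [0.1, 2.0, 0.05, 1.0, 0.3]
    print(select_hard_examples(per_region_losses, keep=2))
    print(ohem_loss(per_region_losses, keep=2))
```

In a real detector the selection would normally run on tensors inside the training loop, and only the selected regions would contribute to the backward pass. The plain-Python version here is meant only to make the selection rule concrete.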