This paper addresses infrared aerial object detection in complex environments and constructs an infrared aerial object detection algorithm based on deep learning. Traditional infrared object detection algorithms rely on manually set rules, and detection performance drops sharply once practical conditions fall outside those preset rules. This paper therefore studies a regression-based deep convolutional network with self-learning ability, namely a deep learning single-stage object detection algorithm. In addition, training a deep learning infrared object detection algorithm requires a large number of infrared samples, but infrared samples are costly to collect and data are scarce. This paper therefore also designs an infrared sample generation algorithm based on a generative adversarial network.

Firstly, to address the weak semantics of low-level features and the low resolution of high-level features in deep convolutional networks, a single-stage infrared aerial object detection algorithm based on multi-scale feature layer fusion is proposed. An Integrated Attention Module (IAM), which combines channel and spatial attention sub-modules, is introduced into the feature extraction network of the single-stage object detection algorithm YOLOv5 to enhance the feature representation of aerial targets. A two-path feature fusion network is also proposed to replace PANet, the fusion network of YOLOv5. It has the following characteristics. Nodes that have only one input edge perform no feature fusion and contribute little to the fusion network, so they are removed to simplify the network. Skip connections are added between non-adjacent input and output nodes at the same scale, which further strengthens the fusion of high-level and low-level features without increasing the amount of network computation. Features at different scales often contribute unequally to the fusion network. The fusion network therefore
adds a weight, learned during network training, to each input feature layer to represent the importance of that input to the fused output. The two-path feature fusion network is treated as a repeatable functional layer so that it can be stacked to achieve higher-level feature fusion. Compared with the original YOLOv5, the improved algorithm raises mAP by 5.55% and 19.7% on the ground-space background dataset and the infrared aerial dataset, respectively.

Secondly, because the accuracy of object detection and localization is affected by the bounding box regression loss function, a bounding box regression loss function based on IoU (Intersection over Union), named IAIoU (Included Aspect-ratio IoU), is designed with two optimization terms. The first term combines the difference between the union and intersection areas of the predicted box and the labeled box with ratios involving the area of the smallest enclosing region of the two boxes and the square of that area, which prevents the loss function from degenerating when one box contains the other. The second term is the difference between the aspect ratios of the two boxes, which drives the predicted box closer to the labeled box. The bounding box regression loss of the single-stage detection algorithm YOLOv3 is replaced with the IAIoU loss, and on the infrared aerial dataset the mAP reaches 92.17%, 1.37% higher than the original YOLOv3.

Finally, since training deep learning networks requires a large number of infrared samples, an infrared sample generation algorithm based on a Conditional Generative Adversarial Network (CGAN) is proposed. The generator and discriminator of the model are designed with segmented images of infrared aerial targets as the condition labels. Because the encoder of a conventional generator loses information during down-sampling, resulting in insufficient
detail and blurred images when the decoder restores the image, a skip connection is added between the encoder and decoder, connecting the i-th layer to the (n-i)-th layer, where n is the total number of layers in the network. The generator can thus perceive more detailed information and generate highly realistic infrared samples. The discriminator of the original generative adversarial network outputs only a single real-or-fake score for the whole image, which ignores local detail and is prone to extreme evaluations. To address this, the image is divided into multiple fixed-size blocks and the designed discriminator outputs a matrix. Each value in the matrix is the probability that the corresponding block of the input image is real, and all values are averaged to give the final discrimination result. Through the adversarial game between the generator and the discriminator, the infrared sample generation model pays more attention to local detail, and the set of infrared aerial samples in complex environments is enriched.

The experimental results show that the deep learning single-stage infrared aerial object detection algorithm studied in this paper significantly improves the accuracy of infrared aerial object detection and reduces the false detection rate and missed detection rate. The proposed integrated attention module, two-path feature fusion network, and IoU-based bounding box regression loss are generally applicable to single-stage object detection algorithms. By redesigning the structure of the generator and discriminator, an infrared sample generation model based on a conditional generative adversarial network is proposed, which can effectively enrich the data samples required by infrared aerial object detection algorithms.
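The learned per-input weighting described for the feature fusion network can be illustrated with a minimal sketch. The abstract only states that each input feature layer receives a trained weight. The clamping at zero, the normalization by the weight sum, the `eps` constant, and the function name `weighted_fusion` are assumptions made here for illustration, and the features are flattened to plain lists to keep the example dependency-free.

```python
from typing import List, Sequence


def weighted_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse equally sized feature vectors using one scalar weight per input."""
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all inputs must share one shape (resize them first)")
    # Assumption: clamp weights at zero so an input is never subtracted.
    clamped = [max(0.0, w) for w in weights]
    # Assumption: normalize so the fused output stays on the input scale.
    total = sum(clamped) + eps
    return [
        sum(w * f[j] for w, f in zip(clamped, features)) / total
        for j in range(length)
    ]
```

With equal weights and `eps=0.0`, fusing `[1.0, 2.0]` and `[3.0, 4.0]` gives their element-wise mean. In a real network the weights would be trainable parameters updated by backpropagation rather than fixed numbers.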
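The motivation for the two IAIoU terms can be seen from the quantities they are built on. The sketch below computes plain IoU, the smallest enclosing area, and an aspect-ratio gap for boxes given as `(x1, y1, x2, y2)`. It does not reproduce the IAIoU formula, which the abstract does not state in full; the helper names and the absolute-difference form of the aspect-ratio gap are assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def area(box: Box) -> float:
    """Area of a box; inverted (non-overlapping) coordinates give 0."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def enclosing_area(a: Box, b: Box) -> float:
    """Area of the smallest axis-aligned box that contains both boxes."""
    return area((min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3])))


def aspect_ratio_gap(a: Box, b: Box) -> float:
    """Assumed form of the second term: absolute difference of width/height ratios."""
    def ratio(box: Box) -> float:
        height = box[3] - box[1]
        if height <= 0:
            raise ValueError("box must have positive height")
        return (box[2] - box[0]) / height
    return abs(ratio(a) - ratio(b))
```

When a 2×2 predicted box lies anywhere inside a 4×4 labeled box, the IoU is 0.25 and the enclosing area equals the union (16), so a loss built only from these values cannot tell the placements apart. This is the containment case that the first IAIoU term is designed to handle.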
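The block-wise discriminator output can be sketched in the same spirit. The example below splits an image into fixed-size blocks, scores each block, and averages the resulting matrix. The per-block scorer `score_fn` is a hypothetical stand-in for the trained discriminator, and `mean_intensity` is only a toy scorer; a convolutional discriminator would normally produce the whole matrix in one forward pass instead of cropping blocks explicitly.

```python
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]


def patch_scores(
    image: Image, patch: int, score_fn: Callable[[List[Sequence[float]]], float]
) -> List[List[float]]:
    """Return a matrix holding one score per non-overlapping patch-by-patch block."""
    height, width = len(image), len(image[0])
    if patch <= 0 or height % patch or width % patch:
        raise ValueError("image size must be a positive multiple of the patch size")
    matrix = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            block = [r[left:left + patch] for r in image[top:top + patch]]
            row.append(score_fn(block))
        matrix.append(row)
    return matrix


def discriminate(
    image: Image, patch: int, score_fn: Callable[[List[Sequence[float]]], float]
) -> float:
    """Average all block scores into a single discrimination result."""
    values = [v for row in patch_scores(image, patch, score_fn) for v in row]
    return sum(values) / len(values)


def mean_intensity(block: List[Sequence[float]]) -> float:
    """Toy scorer (assumption): mean pixel value standing in for P(block is real)."""
    pixels = [p for row in block for p in row]
    return sum(pixels) / len(pixels)
```

Because every block contributes equally to the average, a single badly generated region lowers the overall score even when the rest of the image looks plausible, which is what pushes the generator toward better local detail.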