| In recent years, deep learning has developed rapidly. As one of the important research tasks in deep learning, object detection is widely used in daily life and is of great significance to both civilian and military applications. However, detecting small objects remains a difficult part of the object detection task. Small targets suffer from low resolution, indistinct features, and susceptibility to background interference, and infrared dim and small targets are even harder to detect. Moreover, many current detection algorithms struggle to balance accuracy and speed. Most small target detection research improves accuracy at the expense of detection speed and makes the model overly complex, which prevents real-time detection and deployment on devices. Therefore, designing a lightweight and efficient small target detection network has important research significance and application value. In response to these challenges, this paper selects YOLOv5s, which has excellent overall performance, as the baseline model. Based on it, two lightweight and efficient detection models, YOLOv5s-P34 SC and YOLOv5s-P23 C, are proposed to improve the overall performance of small target detection and to address the difficulty of detecting infrared dim and small targets. The experimental results show that both models effectively improve small target detection. The specific innovations and work are as follows: (1) Based on YOLOv5s, we use SIoU as the bounding box regression loss function, which addresses the direction mismatch between predicted and ground-truth boxes in small target detection, speeds up model convergence, and improves detection accuracy and speed to a certain extent. As a convolutional neural network extracts image features, some target details are lost as the network deepens, so the deep features
extracted by a deep neural network are not conducive to small target detection. Therefore, in the FPN+PAN structure of the baseline network, we retain only the detection layers for the feature maps fused after 8× and 16× downsampling, which yields the P34 detection network. This not only simplifies the network structure and increases detection speed but also effectively improves detection accuracy. We introduce the lightweight upsampling operator CARAFE into the FPN structure of the P34 detection network to obtain higher-quality feature maps, which further improves detection accuracy. Through ablation experiments, we verify that each improvement benefits the baseline network YOLOv5s. Compared with YOLOv5s, the YOLOv5s-P34 SC model, which combines these three improvements, improves the detection accuracy mAP0.5 by 1.1% on the VisDrone2019-DET validation subset and by 0.8% on the test set. The detection speed increases by 5.4%, and the number of parameters is reduced by nearly 25%. The detection result figures also show more intuitively that the improved algorithm effectively improves the detection of small targets. YOLOv5s-P34 SC also offers better overall performance than several other classical deep learning object detection algorithms. (2) Infrared dim and small targets occupy few pixels and have indistinct features, which leads to missed and false detections. We designed a lightweight, high-precision infrared dim and small target detection model, YOLOv5s-P23 C. We first collected two public infrared small target datasets to improve generalization and then preprocessed them. We used the FFmpeg tool to extract part of the video data into image frames for subsequent model training. Then, we generated target label information by 3×3 and 5×5 extension box annotation, fine-tuning, and manual annotation in the
dataset according to the size of the target points. We applied Mosaic-9 data augmentation in the YOLOv5s algorithm, which effectively enriched the features and number of infrared dim and small target samples and thereby improved the model's performance on small-scale targets. On the basis of YOLOv5s, we construct a high-resolution feature fusion path that fuses the shallow 4× downsampled feature map, which yields a P2 detection head for detecting smaller targets. This path passes more shallow features to the deeper network and enriches the feature map information. The FPN+PAN structure retains only the newly constructed P2 feature fusion path and the P3 feature fusion path used to detect small targets, which gives the final P23 dual-scale detection structure. The lightweight upsampling operator CARAFE is also introduced into the FPN structure to further strengthen feature extraction. Finally, we verified the effectiveness of each improvement through ablation experiments. The final improved algorithm, YOLOv5s-P23 C, meets the requirement of real-time detection. On the infrared dim and small target dataset, compared with the baseline algorithm YOLOv5s, Precision increases by 1.5%, Recall increases by 3.9%, mAP0.5 increases by 4.5%, and the number of parameters is reduced by nearly 22%. Both the detection results and the experimental metrics are better than those of several mainstream YOLO-series algorithms. Moreover, the detection results show that YOLOv5s-P23 C effectively reduces false and missed detections of infrared dim and small targets and improves the overall detection performance. |
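
The 3×3 and 5×5 extension box annotation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the annotation procedure in detail, so everything here is an assumption: the function name `extension_box_label`, the convention that the target point is the box centre in pixel coordinates, the clipping at image borders, and the choice to emit the normalized `class x_center y_center width height` line that YOLOv5 label files use.

```python
def extension_box_label(cx, cy, box_size, img_w, img_h):
    """Expand a target point into a box_size x box_size box (hypothetical helper).

    (cx, cy) is assumed to be the target centre in pixel coordinates.
    Returns normalized (x_center, y_center, width, height).
    """
    half = box_size / 2.0
    # Clip the extended box to the image so border targets stay valid.
    x_min = max(cx - half, 0.0)
    y_min = max(cy - half, 0.0)
    x_max = min(cx + half, float(img_w))
    y_max = min(cy + half, float(img_h))
    # Normalize by image size, as YOLO-style labels expect.
    xc = (x_min + x_max) / 2.0 / img_w
    yc = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return xc, yc, w, h


def to_label_line(class_id, box):
    """Format one normalized box as a YOLO-style label line."""
    xc, yc, w, h = box
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A point target in the middle of a 256x256 frame, annotated with a 3x3 box.
    print(to_label_line(0, extension_box_label(128, 128, 3, 256, 256)))
    # A larger target near the image corner, annotated with a 5x5 box.
    print(to_label_line(0, extension_box_label(1, 1, 5, 256, 256)))
```

In this sketch the box size (3 or 5) would be chosen per target according to its apparent size, in line with the abstract's description; the fine-tuning and manual annotation steps are not modelled.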