| With the rapid development of remote sensing technology, acquired remote sensing images carry rich feature information and are widely used in military object identification, environmental monitoring, and other fields. In recent years, the rapid progress of deep-learning-based computer vision has brought remote sensing and computer vision closer together and offered new solutions for remote sensing object detection. However, remote sensing images are affected by the imaging angle, complex backgrounds, the low proportion of target pixels in the image, dense arrangement of targets, and arbitrary target orientation. As a result, existing object detection algorithms may produce many missed and false detections when applied to remote sensing images, and they are slow and generalize poorly. (1) To address the insufficient accuracy of existing algorithms in remote sensing object detection, a cyclic asymmetric convolution module is proposed to improve the network's ability to extract global contextual information while enhancing rotational invariance with respect to the target. A dynamic global-local feature fusion module is used at the neck of the network to incorporate global information into each layer of the feature pyramid and improve the accuracy of subsequent detection tasks. A dynamic gating mechanism is introduced to adaptively decide whether to include the module, achieving a better balance between accuracy and computational cost. For arbitrary target angles, a dense label encoding approach is introduced to design a rotated detection head, increasing the network's ability to detect rotated targets in any direction. (2) To address the slow speed of existing algorithms in remote sensing object detection, a lightweight network improvement method is proposed. A new C3e module is designed to pass shallow feature information to deeper layers, avoiding information loss during the transfer. A combination of convolution and max pooling is used for downsampling to keep information flowing smoothly through the network. To address the task conflict in coupled detection heads, a rotated decoupled detection head with implicit learning capability is proposed, which resolves the conflict and improves the network's ability to capture implicit information. On the DOTA-v1.5 dataset, the two methods, both built by improving YOLOv5s, the small model of the YOLOv5 family, achieve detection accuracies of 73.4 mAP50 and 72.9 mAP50 at speeds of 87 fps and 96 fps, respectively. The experimental data show that detection accuracy improves by 9.8% and 9.5%, respectively, while detection speed stays close to that of the baseline model. The method performs well for object detection in remote sensing scenes. Experiments on a natural scene dataset also give good results, indicating that the method generalizes well. |
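
The dense label encoding used for the rotated detection head is not specified in the abstract. As an illustration only, the sketch below assumes one common variant: the angle is quantized into bins and each bin index is written as a Gray code, so the head predicts a few bits instead of one class per bin. The function names, the 8-bit code length, and the 180-degree angle range are all hypothetical choices made for this example, not details taken from the paper.

```python
def angle_to_gray_code(angle_deg, num_bits=8, angle_range=180.0):
    """Quantize an angle into 2**num_bits bins and return the bin's Gray code.

    Assumed setup (not from the paper): angles live in [0, angle_range).
    Returns a list of bits, most significant bit first.
    """
    num_bins = 1 << num_bits
    bin_index = int((angle_deg % angle_range) / angle_range * num_bins)
    bin_index = min(bin_index, num_bins - 1)  # guard against rounding up
    gray = bin_index ^ (bin_index >> 1)       # binary -> Gray code
    return [(gray >> i) & 1 for i in reversed(range(num_bits))]


def gray_code_to_angle(bits, angle_range=180.0):
    """Decode Gray code bits (MSB first) back to the centre of the angle bin."""
    gray = 0
    for bit in bits:
        gray = (gray << 1) | bit
    # Gray -> binary: XOR the value with all of its right shifts.
    binary = gray
    shift = gray >> 1
    while shift:
        binary ^= shift
        shift >>= 1
    num_bins = 1 << len(bits)
    return (binary + 0.5) * angle_range / num_bins


if __name__ == "__main__":
    code = angle_to_gray_code(45.0)
    print(code, gray_code_to_angle(code))
```

With a Gray code, neighbouring angle bins differ in exactly one bit, including the wrap-around from the last bin to the first, which matches the periodic nature of the angle. The decoded angle is off by at most half a bin width, so the code length trades quantization error against the number of outputs the head must predict.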