| Remote sensing images have a wide range of applications, and target detection is the key enabling technology. At the same time, remote sensing images are characterized by complex backgrounds and by targets that vary widely in scale and orientation, which makes detection more difficult. In recent years, the mainstream approach has been to add improved modules or function constraints to deep learning detection frameworks designed for natural scenes. However, because the underlying framework targets natural scene images, the room for accuracy improvement is limited when it is applied to remote sensing images. This paper therefore builds an easily extensible target detection framework for remote sensing images by considering the characteristics of remote sensing targets and introducing task-oriented attention. To further improve generalization and detection accuracy, a target-guided attention is designed. The main contents of the paper are as follows. First, the core stages and typical techniques of deep learning based target detection are summarized. By analyzing typical detection frameworks in light of the characteristics of remote sensing targets, the paper points out structural limitations that prevent them from balancing localization accuracy against classification accuracy and from detecting small remote sensing targets, which motivates the construction of a dedicated remote sensing target detection framework. Second, a subtask attention-guided target detection method (StAN) for remote sensing images is proposed. On top of a single shared network, a dedicated sub-task attention mechanism is designed to construct two sub-task networks, one for localization and one for classification, and both are optimized effectively at the same time, so that the model achieves high classification accuracy and high localization accuracy. In addition, to avoid missing small remote sensing targets, a task redistribution strategy is
proposed to endow the localization sub-task network with a classification function and, in combination with a guidance mechanism built on the attention map, to establish target guidance between the sub-task networks, which effectively improves the overall performance of the model. Experiments show that StAN can effectively optimize classification and localization at the same time, and that it maintains remote sensing target detection accuracy while remaining highly scalable and widely applicable. Finally, a method that embeds multi-dimensional attention into StAN (StAN-MAG) is proposed to further improve the generalization of the model in complex scenarios. A multi-dimensional sampling module (MdS) is introduced into the network to capture diverse features by fusing diverse convolution kernels. An improved channel and spatial dual attention derived from global visual attention, GMulW, is also embedded to enhance contour and spatial context information in the feature map and to guide the network in mining targets. In addition, a target-based pixel attention (TPA) module is introduced to generate a foreground attention map that guides the network to exploit the features of the target area. Experiments show that StAN-MAG improves performance in complex scenarios, and ablation experiments further confirm that MdS, GMulW and TPA each contribute to the improvement. |
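
The idea of two sub-task branches that share one feature map, each with its own attention, can be illustrated with a minimal sketch. It is not the method from the paper: the function names (`spatial_attention`, `apply_attention`, `subtask_branches`), the per-channel weight vectors, and in particular the guidance formula that lets the localization attention map modulate the classification attention are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(features, channel_weights):
    """Collapse a C x H x W feature map into an H x W attention map in (0, 1).

    Each location is a weighted sum over channels followed by a sigmoid.
    The weighting scheme is a hypothetical stand-in for a learned layer.
    """
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            sigmoid(sum(w * ch[y][x] for w, ch in zip(channel_weights, features)))
            for x in range(width)
        ]
        for y in range(height)
    ]


def apply_attention(features, attention):
    """Reweight every channel of the feature map by the same H x W attention map."""
    return [
        [[v * a for v, a in zip(row, att_row)] for row, att_row in zip(ch, attention)]
        for ch in features
    ]


def subtask_branches(shared, loc_weights, cls_weights):
    """Derive localization and classification features from one shared feature map."""
    loc_att = spatial_attention(shared, loc_weights)
    cls_att = spatial_attention(shared, cls_weights)
    # Assumed guidance rule: classification attention is raised where the
    # localization branch also responds, and stays inside (0, 1).
    guided_cls_att = [
        [c * (1.0 + l) / 2.0 for c, l in zip(c_row, l_row)]
        for c_row, l_row in zip(cls_att, loc_att)
    ]
    loc_features = apply_attention(shared, loc_att)
    cls_features = apply_attention(shared, guided_cls_att)
    return loc_features, cls_features, loc_att, guided_cls_att


# Toy shared feature map with 2 channels of size 2 x 2.
shared = [
    [[1.0, -1.0], [0.0, 2.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
loc_f, cls_f, loc_a, cls_a = subtask_branches(shared, [1.0, 0.0], [0.0, 1.0])
```

In a real detector the attention weights would be learned convolutional layers and the two branches would feed separate regression and classification heads; the sketch only shows how a single shared representation can be specialized per sub-task and how one branch's attention map can guide the other.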