
Research On Object Detection Methods Based On Cross-layer Attention

Posted on: 2024-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: H T Zheng
Full Text: PDF
GTID: 2568307157983489
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
In recent years, object detection networks have made significant research progress and have been widely applied in scenarios such as medical diagnosis, industrial manufacturing, and autonomous driving. Attention mechanisms have great potential in object detection, as they can help models identify objects accurately. Multi-scale feature-map prediction strategies are now commonly used in object detection networks. However, existing attention methods refine only a single-layer feature map, or fuse at most two layers, without integrating more levels of feature maps into the attention computation. As a result, they fail to model the relationships between feature maps at different levels: the attention module can learn only limited information, and the refined feature maps lack rich contextual information. This thesis therefore proposes two cross-layer attention modules for multi-scale object detection networks:

(1) To address the inability of channel attention to model the interdependencies among channels across multi-level feature maps, we propose a Cross-Layer Feature Attention Module (CFAM) that combines multi-scale feature fusion with an attention mechanism. CFAM consists of cross-layer feature fusion and cross-layer feature refinement. Cross-layer feature fusion merges all prediction feature maps into cross-layer feature maps; cross-layer feature refinement then models the inter-channel dependencies across layers by computing attention on the cross-layer feature maps, and uses the resulting attention weights to re-weight the initial feature maps, producing more informative feature maps. Experimental results demonstrate that introducing CFAM significantly improves both the detection accuracy and the speed of the SSD baseline model, and strengthens its ability to detect small and dense objects.

(2) To address CFAM's limitation of extracting only local channel features while neglecting global spatial features, we propose a Cross-Layer Feature Self-Attention Module (CFSAM) based on the self-attention mechanism. CFSAM consists of three parts: local feature extraction, global feature extraction, and feature fusion and restoration. The local feature extraction part models the local information of the input feature maps. The global feature extraction part fuses multi-scale feature maps into initial feature vectors and applies Transformer computation units to model them globally, yielding refined feature vectors. The feature fusion and restoration part merges the refined feature vectors with the initial ones and restores feature maps at their original scales. Experimental results demonstrate that CFSAM improves the object detection capability of the SSD baseline model in complex scenes and, during training, accelerates model convergence.

Comprehensive comparative experiments show that both cross-layer attention modules significantly improve the detection accuracy of multi-scale object detection networks and enhance the detection of small and dense objects.
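The cross-layer channel-attention idea behind CFAM (pool every prediction level into a channel descriptor, compute attention over the joint cross-layer descriptor, then re-weight each level's channels) can be sketched in plain Python. This is a minimal illustration, not the thesis's implementation: feature maps are nested lists instead of tensors, and a fixed sigmoid over mean-centered descriptors stands in for the learned fully connected layers a real module would use.

```python
import math

def global_avg_pool(fmap):
    # fmap: list of channels, each channel a 2D list -> one scalar per channel
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cross_layer_channel_attention(feature_maps):
    """feature_maps: list of levels; each level is a list of 2D channel maps."""
    # 1. Cross-layer fusion: pool every level and concatenate the descriptors,
    #    so channels from all levels enter one joint attention computation.
    descriptors = [global_avg_pool(level) for level in feature_maps]
    fused = [v for d in descriptors for v in d]
    # 2. Cross-layer refinement: attention weights over the joint descriptor.
    #    (Learned FC layers in a real module; fixed sigmoid here as a stand-in.)
    mean = sum(fused) / len(fused)
    weights = [sigmoid(v - mean) for v in fused]
    # 3. Re-weight each level's channels with its slice of the joint weights.
    out, i = [], 0
    for level in feature_maps:
        refined = []
        for ch in level:
            w = weights[i]
            i += 1
            refined.append([[w * v for v in row] for row in ch])
        out.append(refined)
    return out
```

Because the weights are computed from the concatenated descriptor of all levels, each channel's weight depends on channels in every other level, which is the inter-layer dependency modeling that single-layer channel attention cannot provide.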
Keywords/Search Tags:attention mechanism, object detection, feature fusion, feature enhancement