
Object Detection Algorithm Based On Efficient Self-Attention And Context Information Enhancement

Posted on: 2024-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2568307124971929
Subject: Computer technology
Abstract/Summary:
Object detection is a crucial area of research in computer vision, involving the classification and localization of objects in images. With the rapid advancement of deep learning, object detection techniques have found widespread use in domains such as autonomous driving, industrial quality inspection, and medical diagnosis. Although the technology has reached a certain level of maturity, challenges remain. For example, detectors are prone to false or missed detections caused by occlusion, lighting changes, and object deformation. In addition, the low detection accuracy for small, feature-sparse, or occluded objects has not been adequately addressed. To address these issues, this paper proposes several methods, including efficient self-attention mechanisms, attention mechanisms, and context information enhancement, to preserve fine-grained features and strengthen the interaction between global and local features, thereby improving the network’s ability to detect small objects. The main research contents of this paper are as follows:

(1) First, the research background, significance, and current status of object detection are described. Then, the development of object detection algorithms and the relevant theories and technologies are summarized. Finally, common datasets and evaluation metrics for object detection are briefly introduced.

(2) Two problems are addressed here. The first is the limited effective receptive field in the middle layers of the network, which prevents the network from fully learning long-range dependencies between different features. The second is that the effective features of small objects are easily overwhelmed by complex and diverse background information during the feature fusion stage. To tackle them, this paper proposes the Local and Global Interactive Module (LGIM) and the Enhancing Channel and Spatial Interaction (ECSI) module. The proposed algorithm is evaluated on the TT100K dataset through extensive ablation experiments and visualization analysis. The experimental results demonstrate that the LGIM module effectively enhances the network’s global modeling capability, and that the ECSI module significantly improves the efficient expression of features at different scales. Moreover, the algorithm that combines both modules outperforms other state-of-the-art detection methods in small object detection performance.

(3) When traditional self-attention mechanisms are applied to image data, the computation is excessive and the network cannot adaptively adjust the region over which attention is computed according to the image content. To address this, this paper proposes the Dynamic Selection feature Module (DSM), which improves computational efficiency and enhances information interaction across window features. To address the inability to enhance detail and semantic features during the feature fusion stage, this paper proposes a Combined Details and Semantics Enhancement (CDSE) module to refine detail features and semantic features. Extensive ablation experiments and visualization analysis demonstrate the effectiveness of the proposed modules. The detection accuracy of this algorithm on the PASCAL VOC dataset reaches 90.1%, while the detection accuracy for small objects is significantly improved.
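The computational concern raised in item (3) can be illustrated with a rough count of how many position pairs self-attention has to score on a feature map. The sketch below is not the DSM module from the thesis. It only compares plain global self-attention with a generic scheme that restricts attention to fixed, non-overlapping square windows. The function names, the 64×64 feature map, and the window size of 8 are illustrative assumptions.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Position pairs scored when every position attends to every position."""
    tokens = height * width
    return tokens * tokens


def windowed_attention_pairs(height: int, width: int, window: int) -> int:
    """Position pairs scored when attention stays inside non-overlapping
    window x window tiles (assumption: height and width divide evenly)."""
    if height % window or width % window:
        raise ValueError("height and width must be divisible by window")
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window * tokens_per_window


if __name__ == "__main__":
    # Hypothetical feature-map and window sizes, chosen only for illustration.
    h, w, win = 64, 64, 8
    full = global_attention_pairs(h, w)
    local = windowed_attention_pairs(h, w, win)
    print(f"global: {full}, windowed: {local}, ratio: {full // local}")
```

Under these assumptions the global count grows with the square of the number of positions, while the windowed count grows linearly with it for a fixed window size. Fixed windows do not by themselves exchange information between windows, which is the kind of cross-window interaction the abstract says the DSM is meant to improve.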
Keywords/Search Tags: object detection, self-attention mechanism, attention mechanism, feature enhancement