Performance Improvement Of Attention-Based Detection Models

Posted on:2024-04-11

Degree:Master

Type:Thesis

Country:China

Candidate:S Dai

Full Text:PDF

GTID:2568306932463494

Subject:Computer Science and Technology

Abstract/Summary:

Computer vision is an important field of artificial intelligence, with applications including surveillance, face recognition, text recognition, and autonomous driving. Object detection is a fundamental problem in this field and the basis for other computer vision tasks. Improving model performance by increasing model complexity and dataset richness has reached a bottleneck. An alternative is to introduce attention mechanisms, which enhance a model’s expressive power and improve its detection accuracy. This dissertation explores new attention mechanisms and structures and applies them to detection models.

Attention mechanisms are adaptive weights generated by a deep learning model that help it focus on particular features in the data. Attention along different dimensions strengthens different aspects of the model’s representation, for example the importance of spatial positions or of individual feature maps. Exploring how to generate attention along different dimensions and apply it to object detection is therefore a worthwhile research direction.

Currently, attention mechanisms in detection models are applied almost exclusively to the backbone network used for feature extraction. Because the backbone is deep, modifying it is complex, and adding attention to it greatly increases the model’s complexity. In addition, current attention mechanisms often focus on a single dimension and so fail to exploit comprehensive information, and most attention generation processes are black boxes with poor interpretability.

To address these issues, this dissertation improves the object detection model through two strategies. The first redesigns the model’s structure: feature maps from the backbone are fed to a cascaded multi-level attention structure that generates attention along multiple dimensions, fully exploiting information from different dimensions and improving the final feature representation. The second, motivated by the black-box nature of deep learning models and the large volume of feature data produced by convolutional kernels, introduces statistical information into the model. Classical statistical methods are used to achieve global spatial attention, which improves localization accuracy, helps the model build better attention, and thereby enhances performance.

The methods and structures proposed in this dissertation were validated on multiple datasets. The model was first pre-trained on a large image dataset, then further trained on multiple datasets and evaluated on multiple test sets. Ablation experiments on multiple datasets verified the effectiveness of the proposed methods.
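
To make the idea of attention as adaptive weights along different dimensions concrete, the following is a minimal NumPy sketch. It is not the dissertation's architecture, which the abstract does not specify. It only illustrates a channel-wise weighting followed by a spatial weighting derived from simple statistics (mean and standard deviation) of the feature map. The function names, the weight matrix `w`, and the tensor sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w):
    """Weight each feature map (channel) by a learned importance score.

    feat: array of shape (C, H, W).
    w:    hypothetical learned weight matrix of shape (C, C).
    """
    pooled = feat.mean(axis=(1, 2))      # global average pool -> (C,)
    weights = sigmoid(w @ pooled)        # per-channel weights in (0, 1)
    return feat * weights[:, None, None], weights

def spatial_attention(feat):
    """Weight each spatial position using global statistics of the map.

    Assumption: positions are scored by the z-score of their
    channel-averaged response relative to the whole map.
    """
    energy = feat.mean(axis=0)           # (H, W)
    z = (energy - energy.mean()) / (energy.std() + 1e-6)
    weights = sigmoid(z)                 # per-position weights in (0, 1)
    return feat * weights[None, :, :], weights

# Apply the two stages in sequence on a random stand-in feature map.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
w = rng.standard_normal((8, 8)) * 0.1
out, cw = channel_attention(feat, w)
out, sw = spatial_attention(out)
```

Applying the two stages one after the other loosely mirrors the idea of a cascade that produces attention along more than one dimension. In a real detector the weights would be learned end to end rather than drawn at random.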

Keywords/Search Tags:

Deep learning, Computer vision, Object detection, Text detection, Attention mechanism

Related items

1	Research On Single-stage Object Detection Technology Based On Attention Mechanis
2	Research On Detection Technology Of Communication Base Station Antenna Based On Computer Vision
3	Research On Object Detection Technology With Stereo Vision Based On Deep Learning
4	Research On Lightweight Object Detection Algorithm Based On Deep Learning
5	Research On Object Detection Method Based On Deep Learning And Dual Attention Mechanism
6	Research On Object Detection Algorithm Based On Deep Learning
7	Research And Application Of Monocular 3D Object Detection Algorithm Based On Deep Learning
8	Research On Multi-scale Object Detection Method Based On Attention Mechanism And Dense Connection
9	Object Detection And Recognition For Traceability Video Based On Deep Learning
10	Study On Deep Neural Network’s Stability And Application In Video Object Detection