
Research On Object Tracking Algorithm Based On Attention And Transformer

Posted on: 2024-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yao
Full Text: PDF
GTID: 2568307118475724
Subject: Software engineering
Abstract/Summary:
Object tracking is an important research direction in computer vision and is widely used in fields such as autonomous driving and security monitoring. However, traditional object tracking uses only visible-light images as its data source, and the limitations of the visible-light imaging mechanism often make accurate tracking difficult in harsh environments. To overcome this problem, RGBT (Red-Green-Blue and Thermal) object tracking, which combines infrared and visible-light images, has been proposed. RGBT object tracking removes the information limits of single-modality data, achieves better tracking performance, and can be applied to more complex background environments. In recent years, as related work has advanced, the robustness of RGBT object tracking algorithms has steadily improved. However, these algorithms still face many challenges, in particular the question of how to exploit the complementary information between modalities. This thesis is based on deep learning and focuses on feature extraction from, and feature fusion between, the modal images. The main work covers the following two aspects:

1) In RGBT object tracking, the different modal images provide different amounts of useful information under different challenging conditions. If the modal images are fully exploited according to the challenge at hand, sufficiently robust feature representations can be extracted, which improves the tracker's discriminative ability. This thesis proposes an RGBT object tracking algorithm based on the attention mechanism. The method includes an attribute feature extraction module that extracts features from images under different challenges. The module uses spatial attention to reduce the influence of background information and to adaptively enhance foreground information. Using channel attention, it learns the channel weights of the multi-modal features and reweights the features channel by channel, so that features are extracted adaptively from images under different challenges. The thesis also designs a modality fusion module, which combines the visible-light image features, the infrared image features, and the challenge features based on their similarity, so that modality features propagate to one another, and which enhances the low-level image texture features through residual unit learning. The module thus achieves multi-modal feature fusion and mutual enhancement, improving the robustness of tracking.

2) Current RGBT object tracking algorithms are mostly based on MDNet, which achieves high accuracy but does not meet the speed requirement for real-time tracking. This thesis proposes an RGBT object tracking algorithm based on Siamese networks and Transformers. Siamese networks have a simple, clear structure and turn the tracking task into a matching task, which reduces computation and gives them an advantage in computational efficiency. The network includes an adaptive fusion module that receives multi-scale modality features and fuses them adaptively, producing feature representations that retain more detailed information. The model adopts a relational reasoning module to perform global relational reasoning over the features. In addition, a Transformer mechanism is introduced that combines the template features and the search-image features through self-attention and cross-attention to obtain more prominent interactive response maps, which strengthens the model's ability to discriminate targets and improves tracking performance.

The proposed methods are evaluated on the RGBT234 and GTOT data sets. Compared with other recent multi-modal tracking algorithms, tracking accuracy, success rate, and tracking speed are significantly improved. Ablation experiments and visualization experiments are also designed to verify the effectiveness of the model's component modules and of their implementation.
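
The abstract names channel attention and Transformer cross-attention between template and search-image features but does not give their formulation. The following is a minimal sketch of the two generic operations, not the thesis's implementation. It assumes plain scaled dot-product attention with no learned query/key/value projections, and a simplified channel gate (a sigmoid of each channel's mean activation) standing in for learned channel weights. The function names `cross_attention` and `channel_attention` and the toy feature vectors are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query token attends to all key tokens.

    queries: list of Nq vectors of length d (here: search-region tokens)
    keys, values: lists of Nk vectors (here: template tokens)
    Returns (outputs, weights); each weights[i] sums to 1 over the Nk keys.
    Assumption: learned Q/K/V projections are omitted for brevity.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def channel_attention(features):
    """Simplified channel gating (assumption: no learned layers).

    features: list of C channels, each a flat list of spatial activations.
    Each channel is rescaled by sigmoid(mean activation of that channel).
    Returns (reweighted_features, gates).
    """
    gates = [1.0 / (1.0 + math.exp(-sum(ch) / len(ch))) for ch in features]
    reweighted = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return reweighted, gates


# Toy example: two template tokens and three search tokens, dimension 2.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, attn = cross_attention(search, template, template)

# Toy example: two channels with two spatial positions each.
reweighted, gates = channel_attention([[0.0, 0.0], [2.0, 2.0]])
```

In this sketch each search token receives a weighted mixture of the template tokens, with larger weights on the template tokens it resembles most. In a trained tracker, the projections and channel weights would be learned rather than fixed as they are here.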
Keywords/Search Tags: RGBT object tracking, Attention mechanism, Transformer, Siamese network