Research On Object Tracking Algorithm Based On Twin Networks

Posted on: 2024-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Luo
GTID: 2568307106953339
Subject: Software engineering
Abstract/Summary:
Visual tracking is an important research direction in computer vision. With the rapid rise of artificial intelligence, object tracking is widely used in video surveillance, national defense and security, intelligent transportation, and other fields. Although tracking technology has made breakthroughs, in practical applications the target is disturbed by complex factors such as occlusion, deformation, and similar-looking targets, while background clutter, illumination changes, and blur pose further challenges. In the deep Siamese (twin) network tracking framework, the network locates the target by measuring the similarity between the target region and the search region, with features extracted by two branches that share parameters, which effectively improves tracking accuracy and success rate. However, such trackers extract deep semantic information poorly and have shortcomings in target state estimation. This thesis therefore proposes the following improvements to Siamese-network tracking algorithms:

(1) The backbone networks of algorithms such as SiamFC and SiamRPN are shallow. To extract deeper semantic information and better distinguish foreground from background, the deep residual network ResNet-18 is used as the backbone, which reduces the number of parameters and the computational complexity of the model; its last convolutional layer and fully connected layer are removed. A hybrid attention module is also introduced that treats the features from both the channel and the spatial perspective: channel attention learns to highlight the important information in different channels, while spatial attention enriches the spatial feature information that is learned. Finally, a feature enhancement module is introduced to extract multi-scale image features, which strengthens the representation of the target's local features and provides richer local feature information to the subsequent Transformer module, making up for the lack of local information. To verify the feasibility of the improved algorithm in real-life scenarios and to visualize its tracking results, the algorithm is analyzed visually on a small self-made dataset. It is also tested on the public benchmark datasets OTB100 and VOT2018; the comparison shows that its tracking accuracy and success rate are better than those of the baseline SiamFC and the other compared algorithms.

(2) An encoder-decoder self-attention structure effectively captures the global information of the target, so that features far apart can be associated and the template and search-region features can be fully fused. In addition, a Siamese tracker takes the first frame of the video sequence as the target template, but the target's appearance changes over the course of the sequence through deformation, scale change, occlusion, and other factors. A template update method is therefore introduced: a dynamic template branch is added to the framework, which follows changes in the target's appearance over time and so provides richer temporal and spatial feature information. To examine the tracking performance in everyday scenarios, the improved algorithm is again analyzed visually on the small self-made dataset. Finally, its performance is evaluated on the two public benchmark datasets OTB100 and VOT2018, and the analysis of the comparative metrics verifies that the proposed algorithm outperforms the other compared algorithms.
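
The Siamese matching step described in the abstract (two branches with shared parameters, followed by a similarity measure between template and search region) can be illustrated with a small sketch. This is not the thesis's implementation: the single fixed filter in `embed` is a stand-in for a deep backbone such as ResNet-18, the function names are hypothetical, and cosine similarity is used here so that the toy example works without any training, whereas SiamFC-style trackers apply plain cross-correlation to learned features.

```python
import numpy as np


def embed(image, kernel):
    """Shared embedding: one 'valid' 2-D filter, a stand-in for a deep backbone."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def response_map(template_feat, search_feat, eps=1e-8):
    """Slide the template features over the search features.

    Each offset is scored with cosine similarity (an assumption made for
    this untrained toy example).
    """
    th, tw = template_feat.shape
    out_h = search_feat.shape[0] - th + 1
    out_w = search_feat.shape[1] - tw + 1
    t_norm = np.linalg.norm(template_feat)
    scores = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = search_feat[i:i + th, j:j + tw]
            dot = np.sum(window * template_feat)
            scores[i, j] = dot / (t_norm * np.linalg.norm(window) + eps)
    return scores


def locate(template, search, kernel):
    """Embed both inputs with the SAME kernel, then take the best-scoring offset."""
    z = embed(template, kernel)   # template branch
    x = embed(search, kernel)     # search branch, identical parameters
    scores = response_map(z, x)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return (int(row), int(col)), scores


# Toy data: a random 8x8 target pasted into an empty 24x24 search image.
rng = np.random.default_rng(0)
kernel = np.array([[1.0, 2.0, 1.0],
                   [2.0, 4.0, 2.0],
                   [1.0, 2.0, 1.0]]) / 16.0
template = rng.random((8, 8))
search = np.zeros((24, 24))
search[10:18, 5:13] = template    # target's top-left corner at row 10, col 5

position, scores = locate(template, search, kernel)
print(position)                   # top-left corner of the best match
```

Because both inputs pass through the same `embed` call, the peak of the response map falls at the offset where the search features equal the template features, which in this toy setup is the paste position. A learned backbone, attention modules, or a dynamic template would replace `embed` and the fixed `template` in a real tracker.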
Keywords/Search Tags: Object tracking, Deep learning, Siamese networks, Attention mechanisms, Transformer