
Video Object Tracking Based On Siamese Network And Attention Mechanism

Posted on: 2022-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: L L Shi
Full Text: PDF
GTID: 2518306605965949
Subject: Master of Engineering
Abstract/Summary:
With the rise of intelligent parks, intelligent transportation, intelligent military systems and related fields, visual object tracking has attracted wide attention from industry and academia. Two difficulties in visual object tracking urgently need to be solved. First, the task involves many challenging factors, such as occlusion, deformation, illumination change, background clutter, similar targets and fast motion, and a tracking algorithm must handle several of them at the same time. Second, an algorithm intended for industrial application must deliver both real-time performance and precision. Tracking algorithms based on deep networks concentrate on improving accuracy and often ignore the real-time requirement. Tracking algorithms based on siamese networks combine high precision with fast tracking speed and have been an active research direction in recent years.

To address these problems, this thesis proposes a lightweight siamese network tracking algorithm with an anchor-free structure and an attention mechanism. The design increases the depth of the feature extraction network while reducing the number of model parameters and the amount of computation. Multi-scale fusion, the anchor-free structure and the attention mechanism effectively improve the precision, accuracy and robustness of the algorithm, as well as its performance against background clutter, similar targets, deformation and other challenging factors. The main work of this thesis is summarized as follows.

First, to address the difficulty of balancing real-time performance and precision, a siamese tracking algorithm based on a lightweight network is proposed. The inverted residual block of MobileNetV2 is used as the basic building block of the feature extraction network, on which the siamese tracking network is built, and a multi-scale feature fusion structure is constructed with reference to the feature pyramid network. The network is then combined with the SiamRPN module to improve the precision of detection and tracking. Compared with a siamese tracking network based on the deep network ResNet-50, the accuracy of the proposed algorithm is reduced by about 0.02, but the training speed increases by about 3 times. The detection and tracking speed rises from an average of 10 FPS to an average of 40 FPS, an increase of about 4 times, and exceeds the video playback frame rate of 24 FPS. The algorithm thus balances real-time performance and accuracy and becomes more practical.

Second, to address the problems of anchor-based detection and tracking algorithms, such as the large number of parameters, the imbalance between positive and negative samples, and the difficulty of fitting objects with extreme shapes, an anchor-free siamese tracking network is proposed. On the basis of the lightweight multi-scale fusion siamese tracking network described above, the SiamRPN module is replaced by an anchor-free module and a center-ness branch is added. This balances the positive and negative samples, reduces the number of parameters, and improves the precision of the algorithm by about 0.02.

Third, because challenging factors such as background clutter and similar targets degrade detection and tracking accuracy, attention mechanisms are introduced, including channel attention, spatial attention, and triplet attention, which connects channel and spatial information. Ablation experiments are conducted on multiple test data sets.

The algorithm handles some of the challenging factors, but it still performs poorly when small targets, high-speed motion, low image quality and other challenging factors occur at the same time. Future work will consider adding an online learning module to improve the algorithm's ability to handle different challenging factors.
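
The center-ness branch mentioned in the second contribution can be illustrated with a short sketch. The abstract does not give the formula used in the thesis, so the code below assumes the FCOS-style definition, in which a location inside the target box is scored by how close it lies to the box center. The function names `ltrb_from_point` and `centerness` are hypothetical and chosen only for this illustration.

```python
import math


def ltrb_from_point(x, y, box):
    """Distances from point (x, y) to the left, top, right and bottom
    sides of box = (x1, y1, x2, y2). Negative values mean the point
    lies outside the box."""
    x1, y1, x2, y2 = box
    return x - x1, y - y1, x2 - x, y2 - y


def centerness(l, t, r, b):
    """FCOS-style center-ness (an assumption, not confirmed by the abstract):
    sqrt( min(l, r) / max(l, r) * min(t, b) / max(t, b) ).
    Returns 1.0 at the box center, decays toward 0 near the border,
    and is defined here as 0.0 for points on or outside the border."""
    if min(l, t, r, b) <= 0:
        return 0.0
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


if __name__ == "__main__":
    box = (0, 0, 100, 100)
    for point in [(50, 50), (25, 50), (5, 5), (120, 50)]:
        score = centerness(*ltrb_from_point(*point, box))
        print(point, round(score, 3))
```

In an anchor-free head of this kind, such a score is typically multiplied with the classification score at inference so that predictions from locations far from the target center are down-weighted. Whether the thesis uses it in exactly this way is not stated in the abstract.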
Keywords/Search Tags: Visual Object Tracking, Siamese Network, Lightweight, Anchor-free, Attention Mechanism