
Research On Visual Object Tracking Based On Deep Siamese Network

Posted on: 2022-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: S L Cheng
GTID: 2518306527484344
Subject: Control Science and Engineering
Abstract/Summary:
Visual object tracking is widely used in UAV video, video security, smart cities, and other applications. Given the position and size of the object in the initial frame, the task is to predict its position and size in subsequent video frames. Maintaining both tracking accuracy and speed against a complicated background is difficult. Based on the deep Siamese network structure, this thesis studies technical problems in current visual object tracking algorithms. The specific research work is as follows:

(1) The features extracted by a convolutional network contain low-level, mid-level, and high-level abstract information. Many Siamese network tracking algorithms use only the last convolutional layer to obtain the response map when performing the cross-correlation operation, and do not make full use of the features extracted by each layer of the network. To address this, a target tracking algorithm based on multiple-feature fusion is proposed within the Siamese framework. First, the backbone network extracts features of the target at each level. Second, the features of the middle layers are fused. Finally, the fused features and the last-layer features are each used to obtain a response map, and the two-stage response maps are fused to locate the target.

(2) Redundant channels inherited from the pre-trained network, together with similar interfering objects around the target, mean that the features extracted by the deep network cannot be fully adapted to the target. To address this, a channel-selection target tracking algorithm based on gradient guidance is proposed within the deep Siamese framework. First, a gradient-guided module is embedded after the pre-trained network to select the feature channels with strong expressive ability for the current target. Second, a switch-penalty function is used to eliminate similar interfering objects. Finally, in the template branch and search branch of the Siamese network, a weighted response score map is obtained by multi-channel cross-correlation to make localization more accurate.

(3) In tracking, the size and appearance of the object can change greatly over time. If only the initial frame is used to build the object appearance model, the tracker cannot capture changes in the object's appearance, and tracking precision and accuracy suffer. To address this, a tracking method based on the target's temporal information is proposed to model the target's history. On the one hand, the initial template feature, the current-frame template feature, and the fused template feature from the previous frame are fed into a temporal fusion module, which fuses the historical and current-frame information into the feature used by the tracker for the next frame. On the other hand, to prevent target drift from degrading the temporal fusion, a repository of historical target templates is maintained during tracking, and the target template is corrected in each frame according to template similarity.
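The cross-correlation and two-stage response-map fusion described in item (1) can be illustrated with a toy sketch. This is not the thesis's implementation: the 4x4 "feature maps", the 2x2 templates, the equal fusion weights, and the function names `cross_correlate`, `fuse_responses`, and `peak_location` are all assumptions made for illustration. A real tracker would use multi-channel features from a deep backbone.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` (2-D lists) and return the response map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Inner product between the template and the window at (i, j).
            score = sum(
                search[i + di][j + dj] * template[di][dj]
                for di in range(th)
                for dj in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def fuse_responses(maps, weights):
    """Weighted sum of equally sized response maps (weights are assumed)."""
    h, w = len(maps[0]), len(maps[0][0])
    return [
        [sum(wt * m[i][j] for m, wt in zip(maps, weights)) for j in range(w)]
        for i in range(h)
    ]


def peak_location(response):
    """Return the (row, col) index of the largest response value."""
    cells = ((i, j) for i in range(len(response)) for j in range(len(response[0])))
    return max(cells, key=lambda ij: response[ij[0]][ij[1]])


# Toy single-channel "features": the target sits at rows 1-2, columns 2-3.
search_mid = [[0, 0, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2], [0, 0, 0, 0]]
template_mid = [[2, 1], [1, 2]]
search_last = [[0, 0, 0, 0], [0, 0, 1, 2], [0, 0, 3, 4], [0, 0, 0, 0]]
template_last = [[1, 2], [3, 4]]

# One response map per stage, then a weighted fusion of the two.
response_mid = cross_correlate(search_mid, template_mid)
response_last = cross_correlate(search_last, template_last)
fused = fuse_responses([response_mid, response_last], [0.5, 0.5])
peak = peak_location(fused)
print(peak)
```

In this toy case both stages peak at the same window, so the fused map peaks there as well. The point of fusing stages is that when one stage is ambiguous, the other can still disambiguate the location.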
Keywords/Search Tags: Object tracking, Siamese network, Multiple features fusion, Gradient-guided network, Temporal information fusion