Research On Robust Visual Tracker With Siamese Network

Posted on: 2022-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: C H Liu
GTID: 2518306605965989
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Visual tracking has long been one of the most active and challenging tasks in computer vision because of its wide range of application scenarios. Its difficulty lies in interference from external factors and in changes in the appearance of the target itself. In unrestricted tracking scenarios, the uncertainty of external interference and the dynamic changes of target attributes greatly increase the difficulty of tracking. Recently, tracking algorithms based on the siamese network have attracted the attention of researchers in the field. This type of algorithm uses a siamese subnetwork as the backbone to extract multi-scale embedded features of the target, and it uses the idea of cross-correlation filtering to measure the similarity between the template features and the search-frame features, so a siamese network tracker can effectively balance tracking speed and accuracy. However, existing siamese network trackers usually use only the target state in the initial frame as the template feature to match the target in subsequent search frames. Although this strategy is simple, in long-term tracking scenarios the appearance of the target changes significantly, and the initial template features cannot effectively follow these dynamic changes.

First, a siamese tracking algorithm based on an anchor-free framework is designed. It directly generates the bounding box by regressing the four offsets from the position of each positive sample in the response map to the ground-truth bounding box. The framework can be decoupled into three parts: the siamese sub-networks, the cross-correlation operation, and the bounding box prediction network. The siamese sub-network extracts the template feature of the target and the search feature of the current frame, and the cross-correlation operation uses a separable depth-wise convolution to compute the correlation of each feature channel. To improve the correlation between the template feature and the current search-frame feature, a convolutional network fuses the response feature maps from different hierarchical levels. Then, to enhance the feature representation, the context dependencies across the channels and locations of the response feature map are modeled with an attention mechanism. The prediction network is divided into a classification branch and a regression branch: the classification branch distinguishes the target from the background, and the regression branch generates the bounding box of the target. In addition, to improve prediction accuracy, a scale-regularized intersection-over-union (IoU) loss function is used to train the bounding box regression network. This loss function not only measures the fit between the generated bounding box and the ground truth but also takes the scale of the target into account. The proposed tracker was tested on three benchmarks: OTB50, UAV123, and GOT-10k. The experimental results show that the algorithm achieves state-of-the-art performance.

Second, to address the template degradation of siamese network trackers in long-term tracking scenarios, a multi-template update strategy is proposed to improve tracking stability. During tracking, a template pool is adaptively maintained to store the dynamic changes of the target appearance. The multi-template update strategy can be decomposed into three parts: the template update decision, the template selection strategy, and the template fusion strategy. The update decision for the template pool is determined by the distribution of the category probability in the response map output by the prediction network of the tracker itself. At test time, the initial template feature is regarded as the query. By computing the similarity between the initial template feature and every element in the template pool, the best-matching element is retrieved from the pool as the current template feature, which assists the initial template feature in generating more robust tracking results. The multi-template update strategy was integrated into the proposed siamese network tracking algorithm and evaluated on long-term tracking benchmark datasets. The evaluation metrics on the OTB50 and UAV123 benchmarks improve by 2% and 1%, respectively. Experimental results show that the proposed multi-template update strategy effectively improves the accuracy of the tracking algorithm at the cost of a small amount of tracking efficiency, achieving a balance between tracking accuracy and tracking efficiency.
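
The template selection step described in the abstract (querying the template pool with the initial template feature and retrieving the best match) can be illustrated with a minimal sketch. The abstract does not state which similarity measure the thesis uses, so cosine similarity over flattened feature vectors is an assumption here, and the names `cosine_similarity` and `retrieve_best_template` are hypothetical, not taken from the thesis.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the thesis abstract does not name its similarity measure;
    cosine similarity is used here purely for illustration.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_best_template(
    query: Sequence[float], pool: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, score) of the pool element most similar to the query."""
    if not pool:
        raise ValueError("template pool is empty")
    scores = [cosine_similarity(query, template) for template in pool]
    best_index = max(range(len(pool)), key=scores.__getitem__)
    return best_index, scores[best_index]


# Toy example: the initial template feature acts as the query.
initial_template = [1.0, 0.0, 1.0]
template_pool = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]
best_index, best_score = retrieve_best_template(initial_template, template_pool)
print(best_index, round(best_score, 6))
```

In a real tracker the pool elements would be feature maps produced by the siamese backbone rather than three-element lists, and the retrieved template would then be fused with the initial template as the abstract describes; that fusion step is not sketched here.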
Keywords/Search Tags: visual tracking, siamese network, correlation filter, template update, convolutional neural network