Visual Tracking Based On Deep Cross-Similarity Network

Posted on: 2020-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Wang
Full Text: PDF
GTID: 2428330590497167
Subject: Information and Communication Engineering
Abstract/Summary:
Online visual tracking is a crucial task in computer vision, with applications in security monitoring, autonomous driving, and unmanned aerial vehicle systems. Using statistical, machine learning, or deep learning methods, a tracker continuously predicts the position of a specific target in an arbitrary video sequence. Visual tracking has passed through four stages: particle filter frameworks, correlation filter models, deep convolutional feature representations, and end-to-end offline-trained networks. Existing algorithms still lose occluded targets, struggle to distinguish targets at the instance level, and are disturbed by background noise. Based on an analysis of related work and in view of these challenges, this thesis improves the popular end-to-end siamese network by strengthening its similarity measurement and its stability during online tracking.

This thesis proposes a visual tracking algorithm based on a deep cross-similarity network, which combines a siamese convolutional network, cross-similarity, and an attention mechanism. It aims to address the unreliable similarity scores and inflexible offline-trained parameters of matching models. First, the end-to-end tracking framework combines offline supervised training with online parameter updating, so that the algorithm provides both general feature description and a timely response to variations in the video. Second, on top of the siamese convolutional features, a cross-similarity layer calculates the matching degree between all feature vectors, and the similarity information surrounding the current position is used to make the response map more reliable. Third, an attention mechanism is introduced into the cross-similarity network to adaptively assign weight coefficients to the various similarities between the template and the samples during tracking. The attention layer suppresses the effects of background noise and non-target objects and emphasizes the response values within the target region. Finally, an independent discriminative model measures the confidence of the predicted bounding box in each frame and decides when to update online, which improves efficiency and makes the parameters less prone to overfitting.

Qualitative analysis and quantitative evaluation on the public benchmark both verify the effectiveness of the proposed algorithm. Owing to the combination of the cross-similarity layer and the similarity-based attention layer, the algorithm handles appearance deformation, fast motion, partial occlusion, in-plane rotation, and other challenges well. It can also distinguish the intended target from other objects of the same class, and thus estimates position accurately at the instance level. Under the standard evaluation criteria of the visual tracking benchmark, the proposed algorithm achieves a clear improvement on the precision plots, especially compared with the baseline siamese framework. Compared with other related matching-based tracking algorithms, it achieves satisfactory performance.
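The abstract does not give the exact formulation of the cross-similarity and attention layers, so the following is only a minimal sketch of the general idea under stated assumptions: feature maps are flattened to lists of vectors, the matching degree is taken to be cosine similarity, and the attention weights are a softmax over template positions. The function names `cross_similarity` and `attention_response` and the `temperature` parameter are hypothetical and do not come from the thesis.

```python
import numpy as np


def cross_similarity(template, search, eps=1e-8):
    """Cosine similarity between every search vector and every template vector.

    template: (Nt, C) array of template feature vectors (assumed layout).
    search:   (Ns, C) array of search-region feature vectors (assumed layout).
    Returns an (Ns, Nt) matrix; entry [i, j] compares search i with template j.
    """
    t = template / (np.linalg.norm(template, axis=1, keepdims=True) + eps)
    s = search / (np.linalg.norm(search, axis=1, keepdims=True) + eps)
    return s @ t.T


def attention_response(sim, temperature=0.1):
    """Collapse the similarity matrix into one response value per search position.

    Assumption: weights are a softmax over template positions, so strong matches
    dominate and weak (e.g. background) matches contribute little.
    """
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights * sim).sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = rng.normal(size=(4, 8))  # 4 template positions, 8 channels
    search = rng.normal(size=(6, 8))    # 6 candidate positions
    response = attention_response(cross_similarity(template, search))
    print(response.shape)
```

In this sketch each search position receives a single score, so reshaping the returned vector to the spatial size of the search region would give a response map whose peak marks the candidate target position.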
Keywords/Search Tags: Visual Tracking, Siamese Network, Similarity Measure, Attention Mechanism