Algorithm Study On Object Tracking Via Language And Visual Model

Posted on:2020-08-24

Degree:Master

Type:Thesis

Country:China

Candidate:C X Li

GTID:2428330590463148

Subject:Engineering

Abstract/Summary:

Visual object tracking, which mirrors one of the most basic human visual abilities, has long been a focus of attention in both academia and industry. It has been applied successfully in surveillance, human-computer interaction, assisted driving, and other fields. Nevertheless, bringing object tracking to large-scale deployment and industrial standards in terms of practical performance indicators such as accuracy, robustness, and real-time speed remains a highly open problem. In particular, existing trackers suffer from coarse target appearance modeling, insufficient fusion of multi-modal information, and dependence on large amounts of data. Recently, trackers based on Siamese networks have drawn increasing interest in visual tracking because they achieve a good balance between accuracy and speed. However, most of them cope poorly with significant appearance variations and similar distractors, because they mainly focus on building a matching network offline without online updating, and they use the target feature from the first frame as the only cue for target search. To address this problem, we propose two algorithms: a novel hierarchical tracking method (named Hi-Tracker) that adaptively fuses Siamese features, and a multi-branch Siamese tracking algorithm based on semantic modeling and appearance modeling (named SegA-Siam).

(1) Hi-Tracker integrates discriminative correlation filters into the Siamese matching network through end-to-end training to improve the discriminative power of each feature layer. Then, based on an analysis of a simple yet effective online motion model and the peak-versus-noise ratio (PNR) of the response maps, Hi-Tracker incorporates a fast transformation learning model into the network to capture target appearance variations and to improve robustness to similar distractors, respectively. Finally, Hi-Tracker fuses the network outputs from complementary Siamese features to estimate the optimal target state. Experimental results on OTB2013 [1] and OTB2015 [2] show that Hi-Tracker not only achieves competitive performance compared with other state-of-the-art trackers, but also runs in real time at 25 FPS on a GPU.

(2) SegA-Siam uses natural language to locate the target coarsely and then uses visual features to refine the search for the target location. Specifically, to improve discriminative ability, a Long Short-Term Memory (LSTM) network [3] is used to model appearance. SegA-Siam consists of two branches, both of which are Siamese networks: one uses natural language to understand the semantics of the candidate region, and the other uses a bidirectional LSTM to build a robust model of the target's appearance. The semantic understanding branch, whose structure is similar to that of SiamFC [4], classifies the candidate region into foreground and background and produces a binary segmentation mask. In the appearance modeling branch, the bidirectional LSTM processes the deep features, which are fed into the network from left to right along the width of the feature map. This strengthens the associations among the object features and thereby improves discriminative ability. The two branches are trained separately and combined only at test time, when their two response maps are merged by weighted fusion into the final response map used to determine the target position. Observation shows that the peaks of the response map are concentrated near the target, but the highest peak is not necessarily located exactly at the target position. Therefore, multiple peaks are selected, each corresponding to one candidate target box. The overlap between each candidate box in the current frame and the target box in the previous frame is computed, and the final target box is chosen by combining the peak values and the overlap rates.
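The last steps of SegA-Siam (weighted fusion of two response maps, selection of multiple peaks, and re-ranking of candidate boxes by overlap with the previous frame) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. The function names, the fusion weight, the greedy peak suppression, the stride, the rule that each candidate box is centred on its peak with the previous box size, and the linear combination `alpha * peak + (1 - alpha) * overlap` are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def fuse_responses(semantic, appearance, weight=0.5):
    """Weighted fusion of two equally shaped response maps (weight is assumed)."""
    return weight * semantic + (1.0 - weight) * appearance


def top_peaks(response, k=3, radius=1):
    """Greedily pick up to k peaks, suppressing a small window around each pick."""
    work = response.astype(float).copy()
    peaks = []
    for _ in range(k):
        row, col = np.unravel_index(np.argmax(work), work.shape)
        value = work[row, col]
        if not np.isfinite(value):
            break  # every remaining cell has already been suppressed
        peaks.append((int(row), int(col), float(value)))
        work[max(row - radius, 0):row + radius + 1,
             max(col - radius, 0):col + radius + 1] = -np.inf
    return peaks


def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(x2 - x1, 0.0) * max(y2 - y1, 0.0)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def select_box(response, prev_box, stride=8.0, k=3, alpha=0.5):
    """Choose the candidate box with the best combined peak and overlap score."""
    _, _, w, h = prev_box
    best_box, best_score = None, -np.inf
    for row, col, value in top_peaks(response, k):
        # Assumed mapping: box centred on the peak, keeping the previous size.
        box = (col * stride - w / 2.0, row * stride - h / 2.0, w, h)
        score = alpha * value + (1.0 - alpha) * iou(box, prev_box)
        if score > best_score:
            best_box, best_score = box, score
    return best_box
```

With a fused map whose highest peak lies far from the previous box and whose slightly lower peak lies on it, `select_box` returns the box at the lower peak, which is the behaviour the abstract motivates: the overlap term keeps the tracker from jumping to a distractor that happens to score highest.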

Keywords/Search Tags:

Visual Object Tracking, Deep Learning, Siamese Neural Network, Natural Language, Correlation Filter

Related items

1	Research On Correlation Filter And Siamese Network Hybrid Algorithm For Visual Object Tracking
2	Research On Visual Object Tracking Algorithm Based On Deep Learning
3	Object Tracking Based On Correlation Filter And Deep Model Compression
4	Research On The Algorithms Of Deep Neural Network Based Robust Visual Tracking
5	Research On Target Tracking Based On Siamese Network And Correlation Filter
6	Research On Object Tracking Algorithms Steered By Recurrent And Siamese Neural Network
7	Research On Real-time And Robust Object Tracking Based On Correlation Filter And Siamese Network
8	Research On Visual Object Tracking Algorithm Based On Siamese Neural Network
9	Research On Visual Tracking Based On Deep Siamese Network
10	Visual Tracking Based On Deep Learning