
Studies On One-shot Based Deep Visual Tracking

Posted on: 2019-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Yao
Full Text: PDF
GTID: 2428330566498105
Subject: Computer Science and Technology
Abstract/Summary:
Visual tracking is one of the most challenging tasks in computer vision. In recent years, with the rapid development of deep learning, increasing attention has been paid to deep one-shot trackers. One-shot here means the tracker is trained offline on large amounts of data and then tracks without online adaptation, which yields real-time speed. However, almost all one-shot deep trackers use only the output of the last convolutional layer as the feature representation; this output is rich in semantics, but its low spatial resolution cannot meet the requirement of accurate localization during tracking. Moreover, the training set contains many easy samples. Although the loss of each easy sample is small enough to be ignored individually, in aggregate such samples can dominate the total loss and degrade both training and tracking performance. Taking these issues into account, we first propose two feature fusion methods motivated by the human visual pathway: the first adds response maps from different layers with different weights (a sketch follows this abstract), while the second is a top-down modulation that further considers the relations among layers. To handle the imbalance between the numbers of easy and hard samples, we further propose an online hard negative mining method with a hinge-based loss. Experiments on several popular benchmarks demonstrate the effectiveness of the proposed methods, although the results still fall short of the state of the art.

One-shot deep trackers cannot anticipate appearance and background variations in subsequent frames because they exploit no temporal information. We therefore design a one-shot manual annotation experiment and find that even humans, despite strong learning ability, cannot cope with distractors such as appearance variation and motion blur. The human annotations also fall far behind state-of-the-art trackers, which motivates us to take inter-frame information into consideration. We propose RTINet, a framework for the joint learning of deep representation and truncated inference in visual tracking, which sheds light on combining advances in deep representation learning and correlation filter (CF) modeling to improve performance. RTINet achieves favorable tracking accuracy against state-of-the-art trackers, and its fast version runs in real time at 24 fps.
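The weighted-fusion idea mentioned above can be illustrated with a minimal NumPy sketch. The function name, the per-layer weights, and the 17x17 map resolution are illustrative assumptions; the thesis does not specify them, and in the actual method the fused maps come from deep convolutional features rather than random data.

```python
import numpy as np

def fuse_response_maps(response_maps, weights):
    """Fuse per-layer correlation response maps by a weighted sum.

    response_maps: list of 2-D arrays, one per convolutional layer,
                   already resized to a common spatial resolution.
    weights: per-layer fusion weights (hypothetical values; how they
             are chosen is not specified here).
    """
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    fused = np.zeros_like(response_maps[0], dtype=np.float64)
    for w, r in zip(weights, response_maps):
        fused += w * r
    return fused

# Toy usage: three layers' response maps at a shared 17x17 resolution.
maps = [np.random.rand(17, 17) for _ in range(3)]
fused = fuse_response_maps(maps, weights=[0.2, 0.3, 0.5])
# The peak of the fused map gives the predicted target location.
peak = np.unravel_index(np.argmax(fused), fused.shape)
```

Normalizing the weights keeps the fused map's scale comparable regardless of how many layers are combined, so a higher weight on a deeper layer trades localization precision for semantic robustness.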
Keywords/Search Tags: Visual Tracking, One-shot Learning, Deep Learning, Correlation Filter