Research On Appearance And Decision Models For Visual Tracking

Posted on: 2020-05-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Li
GTID: 1368330614950794
Subject: Computer application technology
Abstract/Summary:
Visual object tracking is a fundamental task in computer vision, with applications in autonomous driving, human-computer interaction, and high-level video processing. Although much progress has been made, visual tracking remains a challenging problem. Its core difficulty lies in recognizing and locating a target that undergoes various changes, given only one target sample in the initial frame. Taking into account the requirements of visual tracking and the available online and offline information, this dissertation explores four key issues concerning the appearance model and the decision model. First, the construction of a robust appearance model: the appearance model should recognize the target as it changes and distinguish it from background distractors. Second, the handling of occlusion: occlusion is common in tracking sequences and usually results in tracking failures. Third, the noise accumulation caused by online updating: online updating is crucial for robust tracking, but the noisy samples used in updating may lead to tracking failures. Fourth, the use of deep learning models for visual tracking: despite the good performance achieved by detection models, sequential information, which is crucial for tracking the target in complicated scenarios, has not been effectively exploited.

Based on these issues, this dissertation studies appearance and decision models for visual tracking, using online and offline information, in the following five parts:

(1) Because any one kind of feature is effective in handling only a few kinds of object variation, it is crucial to combine several complementary features to construct a more robust appearance model. This work adopts color, Histogram of Oriented Gradients (HOG), and gray features for visual tracking. To make the combination effective, this part applies a multi-view learning model for feature fusion, which minimizes the distance between the predictions of each feature. Different from existing multi-feature models, this part uses complementary features and introduces a more effective fusion model.

(2) This part studies deep learning based appearance models. By analyzing the gap between the characteristics of deep features from pre-trained networks and the requirements of visual tracking, this work proposes a target-aware deep tracking model, which generates target-active and scale-sensitive features from pre-trained convolutional neural networks based on gradients obtained by backpropagation. In this way, it converts the features of pre-trained networks into effective tracking features.

(3) This work proposes an occlusion detection-based tracking framework. It first analyzes how occlusion occurs, and then designs an occlusion detection scheme based on multiple instance learning and a Support Vector Machine (SVM) classifier. Considering the complementary advantages of generative and discriminative models in handling occlusion, this work adaptively combines them for visual tracking. When an occlusion occurs, the proposed method exploits the generative model for accurate localization; when the occlusion disappears, it uses the discriminative model to distinguish the target from distractors, which alleviates the drifting problem. The promising results on test videos with occlusions verify the effectiveness of this method.

(4) This work develops a dual-margin model for visual tracking. The dual-margin model can not only distinguish the target from background objects but is also sensitive to target appearance changes. As such, it is able to provide clean and rich target samples by collecting samples with new target appearances and discarding samples with redundant appearances or background noise, which alleviates the error accumulation problem of online updating. The dual-margin model based Siamese tracking framework handles target variations well and achieves promising results on several datasets.

(5) This work proposes an adaptive Region Proposal Network (RPN) for visual tracking. As sequential information is crucial for robust tracking, this method introduces sequential information into the RPN by adapting the features and anchors to the size of the current target. Specifically, this method develops an observation model based on the Siamese framework for predicting the observation of the target. Based on the predicted observations, it computes the real target state with a Bayesian inference model. According to the target state, the adaptive RPN model uses a deformable convolution layer to adapt the features and computes the target bounding box based on the adapted anchors, which facilitates accurate localization.
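
The Bayesian inference step in part (5), which estimates the target state from predicted observations, can be illustrated with a minimal sketch. The abstract does not specify the form of the inference model, so the sketch below assumes a scalar Gaussian (Kalman-style) filter over a single state variable such as target width. The names `Gaussian`, `predict`, `update`, and `track_size`, and the variance values, are hypothetical and not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Belief over one scalar state variable (assumed here: target width)."""
    mean: float
    var: float


def predict(state: Gaussian, process_var: float) -> Gaussian:
    # Assumed motion model: the size stays the same between frames,
    # but uncertainty grows by the process variance.
    return Gaussian(state.mean, state.var + process_var)


def update(prior: Gaussian, observation: float, obs_var: float) -> Gaussian:
    # Bayes rule for a Gaussian prior and a Gaussian likelihood:
    # the posterior is again Gaussian, pulled toward the observation
    # in proportion to how uncertain the prior is.
    gain = prior.var / (prior.var + obs_var)
    mean = prior.mean + gain * (observation - prior.mean)
    var = (1.0 - gain) * prior.var
    return Gaussian(mean, var)


def track_size(observations, init, process_var=1.0, obs_var=4.0):
    # Run predict/update once per frame and keep every posterior.
    state = init
    history = []
    for z in observations:
        state = update(predict(state, process_var), z, obs_var)
        history.append(state)
    return history


if __name__ == "__main__":
    # Hypothetical per-frame width observations; the third is an outlier.
    widths = [50.0, 52.0, 90.0, 54.0]
    for step in track_size(widths, Gaussian(50.0, 1.0)):
        print(f"mean={step.mean:.2f} var={step.var:.2f}")
```

Under these assumptions, a single outlying observation moves the estimate only part of the way toward it, which is the kind of smoothing that would keep size-adapted anchors from jumping on one bad prediction.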
Keywords/Search Tags: visual tracking, target-aware feature model, multi-view learning, adaptive anchors, Siamese network