Visual Tracking Based On Multi-scale Learning

Posted on: 2019-05-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W L Xue
Full Text: PDF
GTID: 1368330626951934
Subject: Computer application technology
Abstract/Summary:
Visual tracking in video sequences is an important topic in computer vision research, and it plays a significant role in applications such as motion recognition and pedestrian analysis. The problem is challenging because of complicating factors such as occlusion and motion blur. This dissertation analyzes the tracking problem from two perspectives: historical semantic information and the continuous decision process. Specifically, tracking is a sequential decision-making process, and the historical semantic information provided by the video sequence can support these decisions. Based on this analysis, we identify three problems: accurate and complete selection of samples; balanced and valid selection of samples; and effective support from a strategic framework. We study these three problems using learning methods at different scales. The main contributions are as follows.

First, to address the incomplete and inaccurate samples used by most tracking-by-detection methods, we develop an object tracking algorithm termed multiscale spatio-temporal context learning tracking (MSTC). MSTC collaboratively explores three types of spatio-temporal context: the long-term historical targets, the medium-term stable scene (i.e., a short, continuous, and stable video sequence), and the short-term overall samples. Unlike the conventional multi-timescale tracking paradigm, which chooses samples in a fixed manner, MSTC formulates a low-dimensional representation, named the fast perceptual hash algorithm (FPHA), to update the long-term historical targets and the medium-term stable scene dynamically according to image similarity. MSTC also differs from most tracking-by-detection algorithms, which label samples only as positive or negative: it introduces fusion salient sample detection (FSSD) to fuse the weights of the samples using visual spatial attention. On the public OTB50 test set, MSTC scores 0.629 on the OPE success rate indicator, and experimental evaluations demonstrate the superiority of the proposed algorithm.

Second, to make sample acquisition balanced and effective, we propose a tracking algorithm based on balanced acquisition in the motion model and intelligent adjustment of the model update, called Motion Model and Model Updater tracking (MMMU). On the one hand, unlike algorithms that focus only on local information (target or background) in the current frame, MMMU uses image segmentation and detection techniques to balance the target and the background at the global spatial scale. On the other hand, unlike frame-by-frame or step-by-step update methods, MMMU adopts a smarter update strategy that analyzes the similarity of the scene and takes selective amnesia into account. On the public OTB50 test set, MMMU scores 0.612 on the OPE success rate indicator, and experiments show that MMMU performs well.

Third, to obtain an effective behavior-level strategic framework, we present a tracking framework based on reinforcement learning over a multi-dimensional state-action space, termed multi-angle analysis collaboration tracking (MACT). MACT comprises a basic tracking framework and a strategic framework that assists it. The strategic framework is extensible and currently includes a feature selection strategy (FSS) and a movement trend strategy (MTS). These strategies are abstracted from a multi-angle analysis of the tracking problem (the observer's attention and the object's motion). On the public OTB50 test set, MACT scores 0.630 on the OPE success rate indicator, and the experimental results show that MACT improves tracking speed and accuracy.

Finally, this dissertation analyzes the nature of tracking and formulates the corresponding problems. Drawing on a variety of biological heuristics (basic human memory models, selective amnesia mechanisms, visual saliency, etc.), we analyze each specific problem accordingly. We use learning methods at different scales (single or multiple time scales, local or global spatial scales, multi-dimensional state-action decision scales, etc.) to model and solve these problems. The experimental results show that visual tracking research need not be limited to optimizing machine learning algorithms, adjusting deep learning networks, or acquiring high-precision image features; performance can also be improved by thinking about the tracking problem itself. The analysis of tracking behavior is intended to make computer vision tracking research more interpretable. The analysis methods above may also benefit other areas of visual research, such as visual inspection and scene analysis.
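
For readers unfamiliar with the OPE success rate figures quoted above, the following Python sketch shows how such a score is commonly computed on OTB-style benchmarks: the overlap (intersection over union) between each predicted box and its ground-truth box is thresholded at a range of values, and the resulting success rates are averaged. This is an illustration under stated assumptions, not the evaluation code used in the dissertation. The function names, the (x, y, width, height) box format, and the 21 evenly spaced thresholds are choices made for this example.

```python
def iou_xywh(a, b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_score(pred_boxes, gt_boxes, num_thresholds=21):
    """Return (score, curve) for one sequence.

    curve[i] is the fraction of frames whose overlap exceeds the i-th
    threshold; score is the mean of the curve, which approximates the
    area under the success plot. The threshold grid is an assumption.
    """
    if len(pred_boxes) != len(gt_boxes) or not pred_boxes:
        raise ValueError("need equally sized, non-empty box lists")
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    overlaps = [iou_xywh(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    curve = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(curve) / len(curve), curve


if __name__ == "__main__":
    # Hypothetical two-frame sequence: one perfect prediction, one shifted.
    gt = [(0, 0, 10, 10), (0, 0, 10, 10)]
    pred = [(0, 0, 10, 10), (5, 0, 10, 10)]
    score, _ = success_score(pred, gt)
    print(f"success score: {score:.3f}")
```

Under this convention a score near 0.63 means that, averaged over all overlap thresholds, roughly 63% of frames count as successfully tracked.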
Keywords/Search Tags:Visual tracking, multi-scale learning, historical semantic information, decision-making process, reinforcement learning