
Robust And Real-time Visual Tracking Under Complex Scenarios

Posted on: 2018-11-18  Degree: Doctor  Type: Dissertation
Country: China  Candidate: F X Zeng
GTID: 1318330518494729  Subject: Communication and Information System
Abstract/Summary:
Visual tracking is widely used in intelligent video surveillance, computer-aided medical diagnosis, human-computer interaction and intelligent transportation. It has attracted wide attention, and much progress has been made over the past decade. However, visual tracking remains challenging in real application scenarios for two reasons. First, an object's appearance varies considerably due to both intrinsic factors (e.g., pose variation, scale change) and extrinsic factors (e.g., varying illumination, occlusion). Second, real-world applications require real-time processing, which demands high computational speed and limits the use of overly complex approaches. This thesis focuses on effective appearance modeling, which is critical to the success of a tracker. The main contributions are as follows.

(1) A real-time visual tracking algorithm under illumination changes is proposed. Visual tracking under illumination changes is a challenging task in numerous computer vision applications. In this thesis, a new feature descriptor called the maximum color difference histogram (MCDH) and a purpose-designed min-max-ratio (MMR) similarity metric are proposed to build the object appearance model and to evaluate the similarity between a candidate and the model, respectively. MCDH is robust to illumination variations and preserves the salient features of the object. The local integral histogram (LIH), which is propagated over a specially designed local image region, is introduced for fast extraction of MCDH. Since MCDH has comparatively many zero bins, a more suitable metric, MMR, defined as the average ratio between the minimum and maximum of an MCDH bin pair, is proposed to compare two MCDHs. The combination of MCDH and MMR makes the tracker robust to illumination changes while keeping computational efficiency high.
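The MMR definition above (the average ratio between the minimum and maximum of a bin pair) can be illustrated with a minimal sketch. The function name `mmr_similarity` is hypothetical, and the handling of bin pairs that are zero in both histograms is an assumption, since the abstract does not specify it; this is not the thesis implementation.

```python
from typing import Sequence


def mmr_similarity(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Average min/max ratio over the bin pairs of two equal-length histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    ratios = []
    for a, b in zip(h1, h2):
        larger = max(a, b)
        if larger == 0:
            # Assumption: a bin pair that is empty in both histograms is skipped,
            # because its min/max ratio is undefined.
            continue
        ratios.append(min(a, b) / larger)
    # Assumption: two all-zero histograms are given similarity 0.
    return sum(ratios) / len(ratios) if ratios else 0.0


# Identical histograms score 1; a bin present in only one histogram scores 0.
print(mmr_similarity([1, 0, 2], [1, 0, 2]))
print(mmr_similarity([2, 0, 0], [1, 0, 4]))
```

Under these assumptions the score lies in [0, 1] for non-negative histograms, and a candidate window would be compared with the model histogram by taking the candidate with the highest score.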
Experiments demonstrate that the proposed tracker performs better than or comparably to state-of-the-art tracking methods under varying illumination, and that it can partially handle complexities such as scale changes, deformation and occlusion.

(2) A real-time tracking algorithm based on a kernel-based multiple-cue adaptive appearance model is proposed. To adapt to complex scenarios, tracking-by-detection methods are employed to achieve robust object tracking. However, these methods suffer from the drifting problem. One important reason is that the adopted Haar-like features may contain background information near the boundaries of the object. Another is that these methods use only intensity information for efficiency reasons. Therefore, in this thesis, a novel kernel-based multiple-cue adaptive appearance model (KBMCAAM) is proposed for robust and real-time visual tracking. In particular, the appearance model is constructed with a naive Bayes classifier trained on sparse multi-scale Haar-like features weighted by a spatial kernel function. Moreover, multiple image cues are integrated within the naive Bayes framework to improve the model's discriminative capacity. Experimental results demonstrate that the proposed method outperforms several discriminative trackers in both robustness and efficiency.

(3) A real-time tracking algorithm that embeds holistic appearance information in a part-based adaptive appearance model is proposed. Part-based adaptive appearance models have been used extensively in increasingly popular discriminative trackers. The main problem of these methods is the stability-plasticity dilemma. To alleviate this problem, this thesis proposes embedding holistic appearance information in the part-based appearance model, which is learned and updated online.
Specifically, the object is represented by sparse multi-scale Haar-like features, and the appearance model is constructed with a naive Bayes classifier. Unlike conventional methods, the classifier is trained on positive and negative samples that are weighted according to their similarity to the holistic appearance model, which is kept constant during the updating procedure. The constant holistic appearance information constrains the updating of the part-based appearance model and thereby makes the tracker more stable. The online updating of the part-based appearance model keeps the tracker adaptive to appearance changes. The proposed approach thus handles the stability-plasticity dilemma well. Experimental results demonstrate that the proposed method outperforms several state-of-the-art discriminative trackers that use only a local appearance model.

(4) A near real-time tracking algorithm based on a discriminative bag-of-words adaptive appearance model is proposed. To handle more complicated application scenarios with limited training data in online tracking, the bag-of-words (BoW) model is introduced to construct the appearance model with a non-parametric approach. BoW has become popular for computer vision tasks such as image classification and action recognition because of its effectiveness and flexibility. However, it has not been widely studied for visual tracking. In this thesis, a novel discriminative bag-of-words (DBoW) model that can both adapt to appearance variations over time and reduce the commonly observed drifting problem in online tracking is proposed to construct the appearance model. Specifically, a contextual region containing both the target and its surroundings is explored to construct a compact representation with two bags-of-words. Each visual word is learned to carry discriminative appearance cues, with the feature-to-bag distance encoded.
To alleviate the drifting problem, an adaptive updating approach is introduced to prevent the background from being integrated into the object model. Based on the DBoW model, a robust and near real-time tracker is proposed in which tracking is achieved by searching for the candidate that best matches the maintained DBoW model. Integral channels are employed for fast feature extraction on dense grids. A simple measure, the mean maximum-tracked-frame ratio (MTFR), is proposed to provide a more application-oriented evaluation of trackers. Experiments demonstrate that our tracker is more robust than all other evaluated trackers in terms of MTFR and achieves competitive accuracy under the available figure of merit.

In summary, this thesis concentrates on effective appearance modeling. To handle various complexities, effort has been devoted to proposing new feature descriptors for constructing the appearance model, building more accurate appearance models, improving the discriminability of the appearance model, and so on. Additionally, computational efficiency is carefully taken into account for real-time applications. Extensive experimental results demonstrate that the proposed models are adaptive, robust and efficient.
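The sample-weighted naive Bayes training described in contribution (3) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: Gaussian class-conditional densities per feature, equal class priors, and precomputed sample weights are all assumed, and the names `weighted_gaussian_fit` and `classifier_score` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Params = List[Tuple[float, float]]  # per-feature (mean, variance)


def weighted_gaussian_fit(samples: Sequence[Sequence[float]],
                          weights: Sequence[float]) -> Params:
    """Fit one Gaussian per feature, with each sample scaled by its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("sample weights must sum to a positive value")
    params = []
    for d in range(len(samples[0])):
        mean = sum(w * x[d] for x, w in zip(samples, weights)) / total
        var = sum(w * (x[d] - mean) ** 2 for x, w in zip(samples, weights)) / total
        params.append((mean, max(var, 1e-6)))  # variance floor avoids division by zero
    return params


def log_gaussian(x: float, mean: float, var: float) -> float:
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def classifier_score(x: Sequence[float], pos: Params, neg: Params) -> float:
    """Naive Bayes log-likelihood ratio (assumes equal priors); > 0 favors the object."""
    return sum(log_gaussian(v, *p) - log_gaussian(v, *n)
               for v, p, n in zip(x, pos, neg))


# Assumed weights: in the described scheme they would come from the similarity
# of each sample to the fixed holistic appearance model.
pos_params = weighted_gaussian_fit([[1.0, 2.0], [1.2, 2.2], [5.0, 5.0]], [1.0, 1.0, 0.0])
neg_params = weighted_gaussian_fit([[-1.0, -2.0], [-1.2, -2.2]], [1.0, 1.0])
print(classifier_score([1.1, 2.1], pos_params, neg_params) > 0)
```

In this toy usage the third positive sample receives weight 0, so it does not shift the positive-class statistics. This mirrors the idea that samples dissimilar to the holistic model contribute less to the update.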
Keywords/Search Tags: Visual tracking, Discriminative appearance model, Bayesian inference, Visual bag-of-words, Feature fusion