Research On Algorithm Of Object Tracking In Complicated Scenes

Posted on: 2016-12-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Cheng
GTID: 1108330503977870
Subject: Information and Communication Engineering
Abstract/Summary:
Visual tracking is one of the most important research topics in computer vision. It is a necessary precondition for activity recognition and semantic-based image understanding, and it has wide application in video surveillance, intelligent transportation, and robot navigation. The key problem in visual tracking is how to represent the object of interest efficiently while excluding external interference such as pose variations, shape and illumination changes, camera motion, and occlusion. Although much progress in object tracking research has been made in the past decades, practical experience shows that algorithms for tracking moving objects in complex scenes are still far from mature, so the problem remains challenging. In this thesis, building on a study of traditional visual tracking methods, we investigate the object appearance model, the template update scheme, and the tracking optimization framework in order to track the object accurately. The main work and contributions of this thesis are summarized as follows.

1) A visual tracking algorithm that combines SIFT features and fragments within a particle swarm optimization (PSO) framework is proposed. The algorithm takes into account both the local information of the target and the matching of SIFT feature points between frames. The local information handles partial occlusion, while SIFT feature point matching between frames makes it possible to recover a lost object. First, a candidate state is partitioned into multiple non-overlapping patches, each denoting a different part of the target, and a saliency evaluation function is defined for each patch. During tracking, we track the object using the patches with high saliency values. When the tracker drifts, the target can be recovered by matching the SIFT features of the current frame against the target template.
In addition, the locations of the matched SIFT feature points are integrated into the iterative results of PSO to obtain a more accurate tracking state. Finally, we update only the superior patches of the target box, whose metrics indicate a low likelihood of occlusion by cluttered background or of an unreliable tracking scene; inferior patches are not updated. The proposed update method therefore avoids introducing noise into the appearance template. The algorithm is verified on various challenging video sequences, and qualitative and quantitative analysis shows that our tracker performs favorably against several state-of-the-art tracking methods.

2) A hierarchical associative multi-object detection and tracking algorithm based on particle swarm optimization is proposed. To cope with tracking drift caused by environmental changes, we propose a multi-object tracking algorithm with a hierarchical associative structure that first coarsely matches the objects and then accurately locates them using particle swarm optimization. Context information is integrated into the generation of particles during the coarse matching stage. Each object in the current frame is matched with the corresponding object in the previous frame via particle label information. In the fine location stage, the objects' final locations are determined by particle swarm optimization iterations. Object locations that deviate prominently during the accurate tracking phase are rectified with the Metropolis-Hastings algorithm, and the objects' templates are updated at the same time. When occlusion occurs, the system adaptively adjusts the weights of the color cue and the motion cue to handle environmental changes and keep tracking the targets.
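The PSO search that contributions 1) and 2) rely on can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `pso_track`, the two-dimensional (x, y) state, the squared-distance cost standing in for an appearance dissimilarity, and all parameter values are assumptions chosen for illustration only.

```python
import random


def pso_track(cost, init_state, n_particles=20, n_iters=60, spread=20.0,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for the state that minimizes `cost`, starting near `init_state`.

    `cost` is a hypothetical dissimilarity between a candidate state and the
    target template (lower is better). All defaults are illustrative.
    """
    rng = random.Random(seed)
    dim = len(init_state)

    # Particle 0 is the previous state itself; the rest are scattered around it.
    pos = [list(init_state)]
    for _ in range(n_particles - 1):
        pos.append([s + rng.uniform(-spread, spread) for s in init_state])
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the global best.
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


# Toy usage: the "true" object position is a made-up value.
target = (105.0, 48.0)


def toy_cost(state):
    return sum((s - t) ** 2 for s, t in zip(state, target))


best_state, best_cost = pso_track(toy_cost, init_state=(100.0, 50.0))
print(best_state, best_cost)
```

In a real tracker the cost would compare image features (for example patch histograms) rather than coordinates; the swarm update itself would stay the same.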
Experimental results show that, by taking the context information of objects into account, our tracker improves the accuracy of object matching and reduces the number of falsely tracked objects.

3) An online tracking approach based on superpixels is proposed. In this thesis, we employ superpixels as the visual cue and propose two tracking algorithms: a superpixel-based L1 tracker (SPL1) and a weighted multiple-instance learning tracker (WMIL). For the SPL1 algorithm, a dictionary is constructed from superpixels to represent the object appearance. An L1-norm minimization problem is then solved for each particle (candidate object state) under the particle filter framework, and the candidate state with the minimum reconstruction error is taken as the tracking result. During dictionary updating, information from several initial frames is retained to alleviate drift. For the WMIL algorithm, the object appearance is modeled with superpixels. During tracking, each sample particle is assigned a weight based on its contribution to the tracking. The classifier is then trained under the multiple-instance learning framework, and the candidate state with the highest classification score is taken as the tracking result. Finally, we use the weighted positive and negative bags to update the classifier. Experimental results show that the two proposed algorithms track the object stably under long-term occlusion, scale variations, and illumination changes.

4) A multi-task learning tracking algorithm in which generative and discriminative models collaborate is proposed. Tracking methods based on generative models discard useful background information that can discriminate the object from the background.
They are therefore less effective for tracking in cluttered environments, whereas trackers based on discriminative models can make full use of both object and background information to separate the object from the background region. In this thesis, the sparse representation problem for the object state is solved with a multi-task learning method. First, each candidate state is regarded as a single task and divided in order into m overlapping patches. Correlations among the corresponding patches of different tasks are mined to obtain a joint sparse representation, which reduces the computational cost. Second, we define separate likelihood metrics for the generative model and the discriminative model. The measure for the generative model considers the local information of the object, while the overall information of the object is integrated into the likelihood metric of the discriminative model. During tracking, the two trackers predict the object location separately, and we use the tracking results of one tracker to update the other tracker's appearance template. This update mechanism with two independent trackers avoids the "self-training" problem of a single tracker. In addition, the dictionary is updated by Metropolis-Hastings sampling. Finally, we extensively analyze the performance of our tracking method on challenging real-world video sequences, where it outperforms other state-of-the-art trackers.

5) A saliency detection based multi-object tracking algorithm is proposed. Sparse representation, which handles noise and clutter well, has been successfully applied to single-object tracking, and it has also shown promising performance in multi-class classification. In this thesis, we are motivated to further explore its classification potential for multi-object tracking. First, a saliency detector is learned offline, and objects are then detected in each frame with this detector. The maximum entry of the sparse coefficient vector corresponds to the most similar object.
We therefore use the index of the sub-dictionary that contains the maximum sparse coefficient to determine which object in the previous frame the sample belongs to. Second, we propose a stochastic gradient descent scheme for online dictionary updating and take the dictionary acquired by the saliency detector as the initial dictionary. The appearance model is trained on the first ten frames of a video, and we determine which object each atom in the initial dictionary corresponds to. The dictionary contains not only object templates but also background information, which results in more robust estimation. In addition, the local information of each object, the relative location of two objects, and the relative locations between the matched feature points and the object center are used to handle occlusions and to keep track of indistinguishable objects. Experimental results show that the sparse representation based tracking method reduces the number of falsely matched objects and improves tracking performance.
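The sub-dictionary lookup described in contribution 5) can be sketched as follows. This is an illustrative assumption, not the thesis code: the function name `assign_object`, the atom labels, and the coefficient values are hypothetical, and computing the sparse coefficients themselves (the L1 minimization) is outside the scope of the sketch.

```python
def assign_object(coeffs, atom_labels):
    """Return the label of the sub-dictionary holding the largest coefficient.

    coeffs      -- sparse coefficient vector of one detected sample
    atom_labels -- for each dictionary atom, the object (or background)
                   sub-dictionary it belongs to; same length as coeffs
    """
    if len(coeffs) != len(atom_labels):
        raise ValueError("coeffs and atom_labels must have the same length")
    if not coeffs:
        raise ValueError("coeffs must not be empty")
    # Index of the maximum entry of the sparse coefficient vector.
    best_atom = max(range(len(coeffs)), key=lambda i: coeffs[i])
    return atom_labels[best_atom]


# Hypothetical dictionary: two atoms per object plus one background atom.
labels = ["object_1", "object_1", "object_2", "object_2", "background"]
sample_coeffs = [0.0, 0.1, 0.0, 0.8, 0.05]

matched = assign_object(sample_coeffs, labels)
print(matched)
```

Keeping a background label in the atom list mirrors the statement that the dictionary holds background information as well as object templates, so a sample can also be rejected as background.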
Keywords/Search Tags: Object tracking, Online learning, Sparse representation, Appearance model, Template update