Research On The Theories And Methods Of Object Tracking And Segmentation In Video

Posted on:2017-03-05

Degree:Doctor

Type:Dissertation

Country:China

Candidate:S Gu

Full Text:PDF

GTID:1108330485988417

Subject:Communication and Information System

Abstract/Summary:

PDF Full Text Request

Video object tracking and segmentation is an important branch of computer vision. It provides the necessary precondition for the reliable operation of the other algorithms as one of the most important underlying technologies. The goal of object tracking is the estimation of the objectâ€™s state in each frame based on the objectâ€™s initial state in the first frame. Object segmentation essentially belongs to the category of object tracking. In addition to the trakcing, object segmentation needs to determine the precise shape of the interesting object in each frame of a video. There are many challenging factors in video object tracking and segmentation such as deformation, occlusion and scale variation. To solve these problems, four novel algorithms are proposed in this thesis. Among them, two approaches are applied to track the rigid object based on mixture of regression mod-els, and some experiments demonstrate our methodsâ€™robustness against scale variation. Other approaches are used to track and segment any form of objects (rigid or non-rigid object) combining with the image segmentation theory, and they have strong robustness against occlusion and scale variation. There are four components in our research:1. Traditional learning-based methods track the object based on the assumption that the relationship between the object appearance and the motion is linear or non-linear. They will result in bad tracking performance when single linear model is adopted. On the other hand, a non-linear regression model should be specified in advance when the relationship is assumed by non-linear. To circumvent the afore-mentioned drawbacks, a mixture linear model is adopted in this thesis, which describes the ground truth model accurately, and decreases the error caused by single linear model. Meanwhile, the pro-posed approach allows the approximation of the parameters in a data-driven way without specifying the distribution approximation in advance. Moreover, a fast learning strat-egy is proposed to decrease the complexity of the learning computation. Computing the inverse of two low-dimension matrices is much faster than computing the one of an high-dimension matrix, and enhances the robustness of the learning model. Especially, this approach constructs the relationship between the object appearance and the motion pa-rameters, and it can estimate the objectâ€™s status directly, such as velocity, direction and scale.2. In mixture linear model, it is still limited because the mixing coefficients are in-dependent of the input data. We can further increase the capability of such models by allowing the mixing coefficients themselves to be functions of the input variable, and it is referred as Mixture of Experts (MoE) where subspaces are divided by "soft boundary" MoE, as an extension of mixture linear model, describes the ground truth model more accurately than single regression model. All modelâ€™s parameters can be learned inde-pendently by maximum likelihood. Moreover, an online algorithm updates the mixture model during tracking, which enhances the systemâ€™s robustness against noise.3. Our proposed learning-based approach has a poor performance in terms of non-rigid object. To solve this problem, a video object segmentation solution based on saliency filter is proposed. A tracking and segmentation method is converted into a salient segmentation solution in this algorithm, and the objectâ€™s spatio-temperal coher-ence is measured by computing the "relative saliency" in the successive frames and the "absolute saliency" in an individual image. A Conditional Random Field is constructed based on both saliencies, and the object is segmented by Graph Cut. This approach is a region-based solution, and each region is abstracted by Local Log-Euclidean Covari-ance Matrix. This kind of feature integrates every raw features together regardless of the regionâ€™s size and shape, enhancing the systemâ€™s robustness against deformation such as non-rigid object. Moreover, our online weight strategy in energy function can adjust the importance factor of each cue robustly according to the different scenes.4. To improve the systemâ€™s accuracy, a low-rank sparse representation based ap-proach is proposed in the thesis. By representing the elements in current frame as sparse linear combinations of dictionary templates, this algorithm capitalizes on the inherent low-rank structure of representations that are learned jointly. The coefficients of the con-strained representation will act as the measurement of the spatio-temporal coherence. Meanwhile, an adaptive dictionary is proposed to enhance the systemâ€™s robust against occlusion. Combining with the proposed online weight strategy, the object is tracked and segmented automatically in the system. Moreover, an extension approach is proposed in terms of multi-object tracking without increasing the systemâ€™s learning complexity.

Keywords/Search Tags:

video, object tracking and segmentation, Mixture of Regression, saliency, low-rank sparse representation

PDF Full Text Request

Related items

1	Research On Video Object Tracking Based On Low Rank Sparse
2	Research Of Low-Rank Sparse Representation Based Object Tracking Algorithms
3	Research Of Online Visual Object Tracking Algorithms Based On Sparse Representation
4	Learning Low-rank And Sparse Representation For Video Object Extraction And Tracking
5	Research On Object Low-Rank Tracking Algorithm Based On Compressed Feature
6	Study On Technology Of Video Analysis For Intelligent Surveillance
7	Algorithm Research On Discriminative Sparse And Low-Rank Representation
8	Salient Object Detection Based On Sparse Subspace Clustering And Low Rank Representation
9	Object Tracking Based On Sparse Representation
10	Research On Algorithms Of Automatic Video Object Segmentation In Complex Scenes