
Research On Visual Tracking Methods Based On Pixel-level Probabilistic Model

Posted on: 2020-04-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W J Zhang
Full Text: PDF
GTID: 1488306107955179
Subject: Control Science and Engineering
Abstract/Summary:
Visual tracking is the basis of video content analysis and understanding and is widely applied in security, military, human-computer interaction, intelligent devices, and other fields. Improving the robustness and accuracy of tracking algorithms is therefore of great research significance and practical value. Although remarkable progress has been made in recent years, visual tracking still faces challenging problems such as target deformation, complex motion, and background clutter. The target representations of state-of-the-art methods focus on modeling the global target structure, which makes them suitable for tracking non-rigid objects with stable structures. For targets undergoing severe deformation, however, the model's ability to capture variations in the shape, motion, and appearance of the target is limited by strict global geometric constraints, making it difficult to obtain stable and accurate tracking results. To solve these problems, this thesis studies a target representation model with better generalizability, together with the corresponding tracking methods, focusing on target deformation caused by multiple factors and on appearance modeling. The contributions of this thesis are as follows:

In terms of target representation, this thesis proposes a pixel-level probabilistic model suited to representing deformable targets. First, the tracking problem is formulated as a pixel-level maximum a posteriori estimation problem that combines a temporal consistency constraint with an appearance model at the pixel level. A temporal consistency model based on pixel matching is proposed, and a probability transfer function is defined according to the optical flow calibration error between successive frames; this function integrates historical estimation results and reduces the influence of estimation errors from the single-frame appearance model. Furthermore, we propose a method for generating the output score of the traditional model from the pixel-level probability, resulting in a multi-representation fusion tracking framework that enables traditional methods to integrate pixel-level discriminative information. Experimental results show that the proposed method outperforms state-of-the-art methods in tracking highly deformable objects, and that the multi-representation model significantly improves the overall performance of its baseline in general object tracking.

A pixel-level probabilistic inference method based on target/background motion correlation modeling is proposed to deal with the target deformation and appearance change caused by the coupling of complex motions of the platform and the target. First, we analyze the relationship between motion and image observation from the perspective of imaging and reduce the parameter space of motion estimation to ensure real-time performance. Second, a local-then-global optical flow estimation method is proposed to obtain the motion field in the target neighborhood while preserving the object's edges. Within a Bayesian framework, this thesis further introduces pixel-level latent variables that represent the probability of each pixel belonging to the target, along with latent variables that represent the motions of both the target and the camera, and models visual tracking as the process of alternately estimating these hidden states. By relating the current state of these hidden variables to their historical estimates, a temporal consistency constraint is used to guarantee the reliability of the predicted state distribution and to make the problem solvable. Both decoupling of the motion parameters and accurate pixel-level target probability estimation are thus obtained. The proposed model significantly improves the robustness of the algorithm to complex motion of the platform and the target.

To increase the robustness of the appearance model against background clutter caused by target deformation, this thesis introduces saliency estimation of the target neighborhood into the visual tracking task and models the two problems jointly. On the one hand, by considering historical information in the tracking scene, a novel spatial-temporal saliency model is proposed, which unifies a saliency transfer model based on optical flow in the temporal domain with a pixel-level observation model based on background distance and online learning in the spatial domain. On the other hand, saliency is introduced as a set of reliability weights that describe the importance of visual features in the target representation model. This makes the target representation more accurate, effectively suppresses interference from background clutter, and improves the robustness of the algorithm against target deformation. Experimental results show that this method is highly reliable for long-term tracking tasks with multiple challenging factors.
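
As a rough illustration of the pixel-level formulation described in the abstract, the Python sketch below combines a temporal prior (the previous target-probability map carried forward along an optical flow field) with a per-pixel appearance likelihood through Bayes' rule. It is not the thesis's implementation. The function names, the Gaussian form of the transfer weight, the `sigma` parameter, the neutral 0.5 fallback prior, and the nearest-neighbour warping are all assumptions made for illustration; the abstract only states that a probability transfer function is defined from the optical flow calibration error between successive frames.

```python
import numpy as np


def warp_prior(prev_post, flow):
    """Carry the previous target-probability map into the current frame.

    Assumption: flow[y, x] = (dx, dy) is a backward flow, pointing from the
    current pixel (x, y) to its match in the previous frame. Matches are
    rounded to the nearest pixel and clamped to the image border.
    """
    h, w = prev_post.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev_post[src_y, src_x]


def transfer_weight(match_error, sigma=0.1):
    """Hypothetical transfer function: confidence in (0, 1] that decays
    as the per-pixel matching error between successive frames grows."""
    return np.exp(-(match_error ** 2) / (2.0 * sigma ** 2))


def update_posterior(prev_post, flow, match_error,
                     lik_target, lik_background, sigma=0.1):
    """Per-pixel posterior that a pixel belongs to the target.

    The temporal prior is the warped previous posterior, blended toward an
    uninformative 0.5 where the flow match is unreliable. The appearance
    likelihoods then update that prior with Bayes' rule.
    """
    warped = warp_prior(prev_post, flow)
    weight = transfer_weight(match_error, sigma)
    prior = weight * warped + (1.0 - weight) * 0.5
    numerator = lik_target * prior
    denominator = numerator + lik_background * (1.0 - prior)
    return numerator / np.maximum(denominator, 1e-12)


if __name__ == "__main__":
    # Toy 3x3 frame: one confident target pixel that moves one pixel right.
    prev = np.full((3, 3), 0.1)
    prev[1, 1] = 0.9
    flow = np.zeros((3, 3, 2))
    flow[..., 0] = -1.0           # each pixel's match lies one pixel to the left
    error = np.zeros((3, 3))      # perfect matches everywhere
    lik_t = np.full((3, 3), 0.5)  # uninformative appearance model
    lik_b = np.full((3, 3), 0.5)
    print(update_posterior(prev, flow, error, lik_t, lik_b))
```

In this sketch a pixel with a large matching error falls back to the single-frame appearance evidence, while a well-matched pixel inherits most of its probability from the previous frame. That is one simple way to read the abstract's claim that the transfer function integrates historical estimates and reduces the influence of single-frame errors.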
Keywords/Search Tags: visual tracking, target representation, pixel-level probabilistic model, motion analysis, saliency estimation, Bayesian inference