
Research Of Multiple Variable Tracking And Assembly Parsing Based On Videos

Posted on: 2016-06-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G F Wang
Full Text: PDF
GTID: 1108330482963661
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer technology, computer vision has become increasingly popular. On the one hand, basic vision problems such as object tracking and detection remain active research topics; on the other hand, higher-level problems, such as video content parsing, have received growing attention. Compared with basic vision problems, video content parsing requires more semantic information, yet it still relies on the underlying vision techniques. This dissertation studies both basic vision problems and a higher-level problem. Specifically, we focus on object tracking and its related problems, from basic object tracking to the more advanced problem of assembly parsing.

First, we study 2D visual tracking. 2D object tracking is a basic tracking problem and the main research direction in the tracking field. Its aim is to estimate the location of an object across consecutive video frames, and it is widely used in video surveillance, human-computer interaction, and other fields. Second, we study 3D visual tracking. Compared with 2D object tracking, 3D object tracking must continuously estimate the 6-DOF pose between the object and the camera, and it is widely used in augmented reality, robotics, and visual servoing. Finally, we study automatic assembly parsing from video. In automatic assembly parsing, the system automatically analyzes the user's assembly process and guides the user in assembling objects online. It must not only identify and track the objects in the scene but also understand the assembly relationships between them, which makes it a higher-level vision problem. This dissertation studies these three problems in depth. The main contributions are as follows:

1. A 2D object tracking model based on sparse and local linear coding. State search is an important component of any object tracking algorithm, and the particle filter is widely used for it. To estimate the state of the object, however, the particle filter must evaluate the weight of each particle separately, which makes it an inefficient search strategy. We propose a tracking algorithm based on a novel stochastic sampling algorithm for effective and efficient state search. The target appearance in a frame is modeled as a linear combination of the observations corresponding to particles drawn stochastically in the image, so the state space of particle observations is extended from discrete to continuous; the solution is determined accurately via iterative linear coding between two convex hulls. The algorithm is also flexible and can be combined with many generic object representations.

2. A 3D object tracking algorithm based on global optimization search. Although edge-based tracking is fast and plausible, correspondence errors are common in images with cluttered backgrounds. We propose a new method based on global optimization for searching these correspondences in textureless 3D object tracking. In our search mechanism, a graph model based on an energy function establishes the relationships among the candidate correspondences, and the optimal correspondences are then searched efficiently with dynamic programming. Compared with locally optimal search, the globally optimal search strategy is more robust and more effective for complex 3D object models and highly cluttered backgrounds.

3. An automatic video assembly parsing technique. Analyzing assembly videos is challenging because the identification and tracking of objects observed in a video are ambiguous. We introduce a tree-based global-inference technique. Our key idea is to incorporate part-interaction rules (PIRs) as powerful constraints that heavily regularize the search space and help parse the assembly video correctly at interactive rates. In addition, pruning is based on an aggregated assembly confidence rather than on local decisions made independently in each frame. Taking the assembly sequence history into account increases the robustness of the algorithm.
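The pruning idea in the third contribution can be illustrated with a small beam search over a tree of assembly hypotheses. This is only a sketch under stated assumptions, not the dissertation's actual algorithm: the function `parse_assembly`, the rule callback `allowed`, the action names, and the per-frame confidence values are all hypothetical, and summed log-confidence is used here as one plausible form of "aggregated assembly confidence".

```python
import math


def parse_assembly(frame_scores, allowed, beam_width=3):
    """Beam search over assembly hypotheses (illustrative sketch).

    frame_scores: list of dicts, one per frame, mapping a candidate
        assembly action to a detection confidence in (0, 1].
    allowed: hypothetical part-interaction rule check,
        allowed(history, action) -> bool.
    Returns (best_sequence, aggregated_log_confidence).
    """
    beam = [((), 0.0)]  # each hypothesis: (action history, summed log-confidence)
    for scores in frame_scores:
        expanded = []
        for history, total in beam:
            for action, conf in scores.items():
                # Rules prune branches that violate the assembly constraints.
                if conf <= 0.0 or not allowed(history, action):
                    continue
                expanded.append((history + (action,), total + math.log(conf)))
        if not expanded:
            raise ValueError("no hypothesis satisfies the part-interaction rules")
        # Prune on the confidence aggregated over the whole history,
        # not on the current frame's score alone.
        expanded.sort(key=lambda h: h[1], reverse=True)
        beam = expanded[:beam_width]
    return beam[0]


def example_rule(history, action):
    """Hypothetical rule: no action repeats, and the top needs a leg first."""
    if action in history:
        return False
    if action == "attach_top" and "attach_leg" not in history:
        return False
    return True


# Hypothetical per-frame confidences; a greedy per-frame choice would pick
# "attach_top" in both frames, which the rules forbid.
frames = [
    {"attach_top": 0.6, "attach_leg": 0.4},
    {"attach_top": 0.7, "attach_leg": 0.3},
]
best_sequence, best_score = parse_assembly(frames, example_rule)
print(best_sequence, best_score)
```

In this toy input the per-frame detector prefers `attach_top` in the first frame, but the rule check removes that branch, so the surviving hypothesis attaches the leg first and the top second. Keeping several hypotheses in the beam and ranking them by their accumulated score is what lets an early ambiguous frame be resolved by later evidence.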
Keywords/Search Tags: Visual Tracking, 2D Object Tracking, 3D Object Tracking, Video Assembly Parsing