Object-based video processing with depth

Posted on: 2001-08-07
Degree: Ph.D.
Type: Thesis
University: The University of Rochester
Candidate: Sun, Zhaohui
Full Text: PDF
GTID: 2468390014458173
Subject: Engineering
Abstract/Summary:
This thesis investigates the importance of the “depth” cue in digital video processing, identifies potential applications, and seeks solutions that cross the boundaries between multidimensional signal processing, computer vision, and computer graphics. We study how to use the depth cue, both explicitly and implicitly, to facilitate efficient video representation, transmission, and manipulation.

First, we take a model-based approach that uses explicit depth information for 3-D wireframe/mesh modeling from an uncalibrated monocular video sequence. 2-D content-based, temporally coherent meshes are built on a video object across video frames through mesh design, motion tracking, and verification. 3-D models are recovered by estimating the depth of the mesh nodes while keeping the connectivity of the 3-D wireframe the same as the topology of the corresponding 2-D mesh. Knowledge of local structures is introduced as multiple geometric constraints in a closed-loop approach for interactive and iterative optimization of 3-D shape/motion and 2-D correspondence using projections onto convex sets (POCS). We also investigate the sensitivity of the depth estimates to input perturbation: the first-order perturbation and covariance of the image measurements, rather than worst-case performance bounds, are propagated to the 3-D shape and motion estimates using matrix perturbation theory and statistics.

Second, we investigate an image-based approach that uses the depth cue implicitly. Most video processing schemes use only 2-D motion models, such as global or local translational and affine models, and process two consecutive video frames at a time. Such schemes are inappropriate or inefficient for 3-D scenes and unconstrained camera motion because of the motion parallax that results from the model/scene mismatch. Instead, we use a true 3-D motion model and three consecutive frames at a time, which overcomes many problems associated with the model/scene mismatch and achieves better performance. The trifocal motion model, which captures the geometric constraints of a 3-D scene and unconstrained camera motion across three perspective projections through the trifocal tensor, is incorporated into the MPEG-4 Video Verification Model as a tool for motion estimation, motion compensation, video compression, image registration, and mosaic synthesis.

Finally, conclusions are drawn from our study and future research directions are discussed.
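To illustrate the kind of three-view constraint the trifocal motion model relies on, the following is a minimal sketch of point transfer with a trifocal tensor. It is not taken from the thesis. It assumes the textbook construction with canonical cameras P1 = [I | 0], P2 = [A | a4], P3 = [B | b4], and all function names and numeric values (trifocal_tensor, transfer_point, the camera entries, the 3-D point) are hypothetical choices made for the example.

```python
# Illustrative sketch only: point transfer across three views with a trifocal tensor.
# Assumed setup (not from the thesis): P1 = [I | 0], P2 = [A | a4], P3 = [B | b4].

def matvec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]

def column(M, i):
    """Return column i of a 3x3 matrix."""
    return [M[r][i] for r in range(3)]

def trifocal_tensor(A, a4, B, b4):
    """Build T with T[i][j][k] = a_i[j] * b4[k] - a4[j] * b_i[k],
    where a_i and b_i are the i-th columns of A and B."""
    T = []
    for i in range(3):
        a_i, b_i = column(A, i), column(B, i)
        T.append([[a_i[j] * b4[k] - a4[j] * b_i[k] for k in range(3)]
                  for j in range(3)])
    return T

def transfer_point(T, x1, line2):
    """Transfer a view-1 point to view 3 using any line through its match in
    view 2 (other than the epipolar line). Result is homogeneous, up to scale."""
    return [sum(x1[i] * line2[j] * T[i][j][k]
                for i in range(3) for j in range(3))
            for k in range(3)]

def project(M, m4, X):
    """Project the 3-D point X = (X, Y, Z) with camera [M | m4]."""
    p = matvec(M, X)
    return [p[r] + m4[r] for r in range(3)]

def dehomogenize(p):
    return (p[0] / p[2], p[1] / p[2])

# Hypothetical cameras and scene point, chosen only for illustration.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A, a4 = I3, [1.0, 0.0, 0.0]
B, b4 = [[0.98, 0.0, 0.2], [0.0, 1.0, 0.0], [-0.2, 0.0, 0.98]], [0.0, 1.0, 0.5]
X = [0.5, -0.2, 4.0]

x1 = project(I3, [0.0, 0.0, 0.0], X)      # image of X in view 1
x2 = project(A, a4, X)                    # image of X in view 2
line2 = [1.0, 0.0, -x2[0] / x2[2]]        # vertical line through x2 in view 2

T = trifocal_tensor(A, a4, B, b4)
transferred = dehomogenize(transfer_point(T, x1, line2))
direct = dehomogenize(project(B, b4, X))  # ground truth from the third camera
print(transferred, direct)
```

In this sketch the position in the third frame is predicted from the first two frames without recovering depth explicitly, which is the sense in which a three-frame model can account for parallax that a 2-D affine model cannot.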
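The projections onto convex sets (POCS) technique named in the abstract can be sketched generically as repeated projection onto each constraint set in turn. The example below is a toy illustration, not the thesis's formulation: the two constraint sets (a box and a hyperplane) stand in for the geometric constraints, and every name and number in it is hypothetical.

```python
# Illustrative sketch only: generic POCS by cyclic projection.
# The constraint sets here are toy stand-ins, not those used in the thesis.

def project_onto_box(x, lo, hi):
    """Closest point to x inside the box lo <= x_i <= hi."""
    return [min(max(v, lo), hi) for v in x]

def project_onto_hyperplane(x, a, b):
    """Closest point to x on the hyperplane a . x = b."""
    dot = sum(ai * xi for ai, xi in zip(a, x))
    norm_sq = sum(ai * ai for ai in a)
    step = (dot - b) / norm_sq
    return [xi - step * ai for xi, ai in zip(x, a)]

def pocs(x0, projections, iterations=200):
    """Apply each projection in turn, repeatedly, starting from x0."""
    x = list(x0)
    for _ in range(iterations):
        for proj in projections:
            x = proj(x)
    return x

# Toy problem: find a point in the unit square that also satisfies x + y = 1.5.
constraints = [
    lambda x: project_onto_box(x, 0.0, 1.0),
    lambda x: project_onto_hyperplane(x, [1.0, 1.0], 1.5),
]
solution = pocs([5.0, -3.0], constraints)
print(solution)
```

When the constraint sets are convex and their intersection is non-empty, the iterates approach a point that satisfies all constraints, which is why prior knowledge expressed as convex constraints can be added to the loop one projection at a time.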
Keywords/Search Tags: Video, Depth, 3-D, Motion, 2-D