Statistical analysis of three-dimensional modeling from monocular video streams

Posted on: 2003-08-21
Degree: Ph.D
Type: Thesis
University: University of Maryland College Park
Candidate: Roy Chowdhury, Amit K
Full Text: PDF
GTID: 2468390011988258
Subject: Engineering
Abstract/Summary:
3D scene modeling from a video sequence is considered one of the most important problems in computer vision. A successful solution has numerous applications, including multimedia communications, surveillance, virtual reality, automatic navigation, and medical prognosis. One of the most powerful techniques for solving this problem is known as structure from motion (SfM). Briefly, the SfM problem is about recovering the absolute or relative depth of static and moving objects using video acquired from single or multiple video cameras. The most challenging case arises when only a monocular video is available and a dense estimate of the depth is required. Solving this problem requires a detailed understanding of the geometry of the 3D world and its 2D projections on the image planes. However, the motion between adjacent frames of a video sequence is usually very small, which introduces large errors in its estimation. Hence, in order to obtain a satisfactory solution, it is important to understand the statistics of these errors and their interaction with the geometry of the problem. The overall aim of this thesis is to show how to combine the statistics describing the quality of the input video data with an understanding of the geometry, in order to obtain an accurate 3D scene reconstruction from a video sequence using the optical flow model.
In our work, we pose the 3D reconstruction problem in an estimation-theoretic framework. We adopt the optical flow paradigm for modeling the motion between the frames of the video sequence. We show how the statistics of the errors in the input motion estimates are propagated through the 3D reconstruction algorithm and affect the quality of the output. We present a new result: that the 3D estimate is always statistically biased, and the magnitude of this bias is significant.
In order to demonstrate our analysis in a practical application, we consider the problem of reconstructing a 3D model of a human face from video. An algorithm is proposed that obtains a robust 3D model by fusing two-frame estimates using stochastic approximation theory and then combines it with a generic face model in a Markov chain Monte Carlo optimization procedure. We address the question of how to automatically evaluate the quality of a 3D reconstruction from a video sequence, and present a criterion using concepts from information theory. Finally, we propose a probabilistic registration algorithm that extends the results of our work to create holistic 3D models from multiple video streams.
Keywords/Search Tags:Video, Model, Problem