The ability to infer human motion from video is key to many vision-based tasks, such as intelligent surveillance, content-based video search, and human-computer interaction. A major difficulty in tracking people, however, is maintaining a satisfactory approximation of the state distribution over time, given the nonlinear dynamics of human motion. Monte Carlo techniques such as particle filters are effective in constrained domains, but their statistical inefficiency renders them intractable in high-dimensional state spaces.

This thesis investigates the use of a dynamical Markov chain Monte Carlo technique for recovering the 3D shape and motion of people from monocular video sequences. The proposed hybrid Monte Carlo filter uses an empirical, edge-based likelihood and a second-order dynamical model with soft biomechanical constraints. Sampling efficiency is improved by following gradients towards good hypotheses, while ensuring a properly weighted approximation to the posterior. Experiments using monocular sequences of people walking against cluttered backgrounds show that the hybrid Monte Carlo filter outperforms conventional particle filters in both accuracy and consistency.
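
The idea of following gradients towards good hypotheses while keeping a valid approximation to the posterior can be illustrated with a generic hybrid (Hamiltonian) Monte Carlo transition. The sketch below is not the filter developed in the thesis: the names `hmc_step` and `run_chain`, the step size, the number of leapfrog steps, and the toy standard-Gaussian target (standing in for an edge-based pose posterior) are all assumptions made for illustration.

```python
import numpy as np

def hmc_step(x, log_post, grad_log_post, rng, step_size=0.1, n_leapfrog=20):
    """One hybrid Monte Carlo transition (illustrative sketch only)."""
    p0 = rng.standard_normal(x.shape)      # resample auxiliary momentum
    x_new, p = x.copy(), p0.copy()

    # Leapfrog integration: the gradient steers the proposal uphill
    # in log-posterior, i.e. towards better hypotheses.
    p += 0.5 * step_size * grad_log_post(x_new)
    for i in range(n_leapfrog):
        x_new += step_size * p
        if i < n_leapfrog - 1:
            p += step_size * grad_log_post(x_new)
    p += 0.5 * step_size * grad_log_post(x_new)

    # Metropolis test on the total energy keeps the chain targeting
    # the posterior despite discretisation error in the integrator.
    h_old = -log_post(x) + 0.5 * (p0 @ p0)
    h_new = -log_post(x_new) + 0.5 * (p @ p)
    if np.log(rng.uniform()) < h_old - h_new:
        return x_new, True
    return x, False

def run_chain(n_steps=2000, dim=10, seed=0):
    """Run the sketch on a toy target (assumed: standard Gaussian)."""
    log_post = lambda x: -0.5 * (x @ x)
    grad_log_post = lambda x: -x
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    samples, n_accept = [], 0
    for _ in range(n_steps):
        x, accepted = hmc_step(x, log_post, grad_log_post, rng)
        samples.append(x.copy())
        n_accept += int(accepted)
    return np.array(samples), n_accept / n_steps
```

In a tracking setting, `log_post` would combine the image likelihood with the dynamical prior over pose, and a transition of this kind would be applied to each hypothesis; how the thesis combines these pieces is not specified in this abstract.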