A framework for automatic creation of talking heads for multimedia applications

Posted on: 2003-07-19
Degree: Ph.D.
Type: Dissertation
University: University of Washington
Candidate: Choi, KyoungHo
Full Text: PDF
GTID: 1468390011488687
Subject: Engineering
Abstract/Summary:
This dissertation presents a framework for the automatic creation of talking heads for various multimedia applications. Within this framework, we present a new audio-to-visual conversion algorithm that uses a constrained optimization approach to take advantage of the dynamics of mouth movements. The dynamics of mouth movements are modeled on the basis of facial muscle analysis, and constraints are derived from that model. These constraints are used to estimate visual parameters from speech within an HMM-based visual parameter estimation framework. The proposed constrained optimization approach finds visual parameters that satisfy the given constraints and maximize the auxiliary function that is used to train the audio-visual HMMs. This enables the algorithm to produce reliable visual parameters even in noisy environments. Experimental results demonstrate that the proposed audio-to-visual conversion method follows the true visual parameters robustly in various noisy environments.
In addition to the constrained optimization approach for robust audio-to-visual conversion, an automatic scheme for creating a 3D head model is presented. This scheme includes a probabilistic approach for deciding whether extracted facial features are appropriate for creating a 3D face model. 2D facial features extracted automatically from a video sequence are fed into the proposed probabilistic framework before the corresponding 3D face model is built, to avoid generating an unnatural or unrealistic 3D face model. We also present a face shape extractor, based on an ellipse model controlled by three anchor points, which is accurate and computationally cheap. To create a 3D face model, a least-squares approach is used to find the coefficient vector needed to adapt a generic 3D model to the extracted facial features. Experimental results show that the proposed scheme can efficiently build a 3D face model from a video sequence without any user intervention for various Internet applications, including virtual conferencing and a virtual storyteller, that do not require much head movement or high-quality facial animation.
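The least-squares adaptation step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the generic model is parameterized, so the sketch assumes a linear model in which the adapted feature positions equal the generic positions plus a weighted sum of deformation modes. The names `fit_coefficients`, `adapt`, `solve_linear`, the two deformation modes, and the toy data are all hypothetical and are not taken from the dissertation.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix from copies so the inputs are left unchanged.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: deformation modes are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_coefficients(generic, modes, target):
    """Return c minimizing ||generic + sum_k c[k] * modes[k] - target||^2."""
    residual = [t - g for t, g in zip(target, generic)]
    k = len(modes)
    # Normal equations (M M^T) c = M r, with each mode stored as a row of M.
    gram = [[sum(p * q for p, q in zip(modes[i], modes[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(p * r for p, r in zip(modes[i], residual)) for i in range(k)]
    return solve_linear(gram, rhs)


def adapt(generic, modes, coeffs):
    """Apply a coefficient vector to the generic feature positions."""
    return [g + sum(c * mode[i] for c, mode in zip(coeffs, modes))
            for i, g in enumerate(generic)]


# Toy data: three 2D feature points flattened as [x0, y0, x1, y1, x2, y2].
generic = [0.0, 0.0, 1.0, 0.0, 0.5, 1.0]
modes = [
    [0.0, 0.0, 1.0, 0.0, 0.5, 0.0],  # hypothetical "widen" mode
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # hypothetical "lengthen" mode
]
# Synthetic "extracted features", generated from known coefficients.
target = adapt(generic, modes, [0.2, -0.1])
coeffs = fit_coefficients(generic, modes, target)
print(coeffs)
```

Solving the normal equations keeps the sketch dependency-free. With real, noisy feature data a QR- or SVD-based solver would usually be the more numerically stable choice.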
Keywords/Search Tags:3D face model, Framework, Automatic, Constrained optimization approach, Facial, Visual parameters