
Hidden Markov models for visual speech synthesis in limited data environments

Posted on: 2002-05-19
Degree: Ph.D
Type: Thesis
University: Air Force Institute of Technology
Candidate: Arb, Harold Allan
Full Text: PDF
GTID: 2468390011992420
Subject: Engineering
Abstract/Summary:
This research presents a new approach for estimating the control points used in visual speech synthesis. First, Hidden Markov Models (HMMs) are estimated for each viseme present in stored video data. Second, models are generated for each triseme (a viseme together with the preceding and following visemes) in the training set. Next, a decision tree clusters and relates HMM states that are similar in a contextual and statistical sense; the tree also estimates HMMs for trisemes not present in the stored video data. Finally, the HMMs are used to generate sequences of visual speech control points for trisemes not occurring in the stored data. Statistical analysis indicates that the mean squared error between the desired and estimated control point locations is lowest when the process uses HMMs trained with short-duration dynamic features, a high log-likelihood threshold, and a low outlier threshold. Comparisons between mouth shapes generated from the synthesized control points and those built from control points estimated on video withheld from HMM training indicate that the process estimates accurate control points. The research presented here thus establishes a practical method for improving the quality of audio-driven visual speech synthesis.
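The pipeline in the abstract, training one HMM per viseme on control-point trajectories and then generating new trajectories from the trained models, can be illustrated with a short sketch. This is not the thesis code: it assumes the hmmlearn library, synthetic stand-in trajectories, and hypothetical names (make_dummy_trajectories, viseme_data); the decision-tree clustering and triseme-context steps are omitted.

```python
# Minimal sketch of per-viseme HMM training and control-point synthesis,
# assuming hmmlearn and synthetic data in place of the stored video features.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)

# Hypothetical training data: for each viseme, a list of control-point
# trajectories, each an array of shape (frames, 2 * n_control_points).
def make_dummy_trajectories(n_seqs=20, frames=30, dim=8):
    return [rng.normal(size=(frames, dim)).cumsum(axis=0) for _ in range(n_seqs)]

viseme_data = {"AA": make_dummy_trajectories(), "M": make_dummy_trajectories()}

# Step 1: estimate one Gaussian HMM per viseme from its trajectories.
viseme_hmms = {}
for viseme, seqs in viseme_data.items():
    X = np.vstack(seqs)                # stack the frames of every sequence
    lengths = [len(s) for s in seqs]   # per-sequence frame counts for fit()
    model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
    model.fit(X, lengths)
    viseme_hmms[viseme] = model

# Generation step (sketch): sample a control-point sequence from a model.
synth, _ = viseme_hmms["AA"].sample(30)

# Evaluation as described in the abstract: mean squared error between the
# desired and the estimated control-point locations.
desired = viseme_data["AA"][0][:30]
mse = np.mean((synth - desired) ** 2)
print(f"MSE between desired and synthesized control points: {mse:.3f}")
```

In the thesis, the triseme models and the decision tree would sit between training and generation, so that unseen trisemes map to clustered states rather than being sampled from a context-free viseme model as above.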
Keywords/Search Tags: Visual speech synthesis, Control points, Hidden Markov models, Stored video data