
Expressive speech-driven facial animation

Posted on: 2006-11-16
Degree: Ph.D
Type: Thesis
University: University of California, Los Angeles
Candidate: Cao, Yong
GTID: 2458390005993238
Subject: Computer Science
Abstract/Summary:
Facial animation is an essential component of many applications that involve realistic virtual humans. However, realistic facial animation remains one of the most challenging problems in computer graphics. In this dissertation, we present a novel approach for automatically synthesizing expressive speech-driven facial animation. Our approach relies on a database of high-fidelity recorded facial motions, which includes speech-related motions with variations across multiple emotions. The input to our system is a spoken utterance and a set of emotional tags, which can be specified by a user or extracted from the speech signal by a classifier. Our system outputs a realistic facial animation that is synchronized to the input audio and faithfully conveys the specified emotions.

The contributions of our work are primarily twofold. First, we propose a speech motion synthesis approach that generates realistic lip motion matching the input speech. Second, we propose an emotion mapping approach that allows us to control expressive visual behavior during speech.

We introduce a novel representation of a recorded facial motion database, called the Anime Graph. Given an input utterance, our lip-synching module searches the Anime Graph for a matching facial motion that satisfies a set of proposed criteria. We also present a greedy search algorithm that yields vastly superior performance over most motion-graph-based algorithms; its time complexity is linear in the length of the input utterance. In our experiments, the synthesis time for a sentence of average length is under a second.

To control expressive visual behavior during speech, we propose an emotion mapping approach. Using independent component analysis, a facial motion is first decomposed into two types of components: emotion (style) and speech (content). We then collect a set of speech-related motions that share the same speech content but differ in emotion. By learning from the emotion components of these motions, we build a mapping function that transfers a speech-related motion from one emotion space to another.
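To make the graph search concrete, the following is a minimal sketch of a greedy walk over a motion graph, in the spirit of the Anime Graph search described above. The node structure, the phoneme-based matching cost, and the fallback jump to the full candidate set are illustrative assumptions, not the dissertation's actual data structures or criteria.

```python
# Hypothetical sketch of a greedy lip-sync search over a motion graph.
# AnimeNode, match_cost, and the graph layout are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class AnimeNode:
    phoneme: str                                     # phoneme label of this motion segment
    motion: list = field(default_factory=list)       # facial motion frames for the segment
    successors: list = field(default_factory=list)   # edges to compatible next segments

def match_cost(node: AnimeNode, target_phoneme: str) -> float:
    """Illustrative matching criterion: 0 for an exact phoneme match, 1 otherwise.
    The dissertation's criteria are richer (audio features, smoothness, etc.)."""
    return 0.0 if node.phoneme == target_phoneme else 1.0

def greedy_lip_sync(start_nodes, phonemes):
    """Greedily pick, for each input phoneme, the cheapest reachable node.
    A single pass over the utterance keeps the work linear in its length."""
    path = []
    candidates = start_nodes
    for ph in phonemes:
        best = min(candidates, key=lambda n: match_cost(n, ph))
        path.append(best)
        # If the chosen segment has no outgoing edges, allow a graph-wide jump.
        candidates = best.successors or start_nodes
    # Concatenate the motion frames along the chosen path.
    return [frame for node in path for frame in node.motion]
```

Because each phoneme triggers one selection over the current candidate set, the total work grows linearly with the number of phonemes, which is consistent with the linear complexity stated above.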
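The style/content separation and emotion mapping can be sketched similarly. The snippet below uses scikit-learn's FastICA as a stand-in for the ICA step and a least-squares linear map between emotion components learned from content-matched motion pairs; the number of components, the indices treated as "emotion", and the linear form of the map are all assumptions made for illustration.

```python
# Minimal sketch of ICA-based style/content separation and emotion mapping,
# assuming `motion` is a (frames x markers) NumPy array. Component count and
# which components count as "emotion" are hand-picked here for illustration.

import numpy as np
from sklearn.decomposition import FastICA

def decompose(motion, n_components=8, emotion_idx=(0, 1)):
    """Split a facial motion into independent components and pull out the
    ones treated as emotion (style); the rest are speech (content)."""
    ica = FastICA(n_components=n_components, random_state=0)
    sources = ica.fit_transform(motion)          # (frames, n_components)
    emotion = sources[:, list(emotion_idx)]
    return ica, sources, emotion

def learn_emotion_map(emotion_src, emotion_dst):
    """Least-squares linear map from one emotion space to another, learned
    from paired motions that share speech content but differ in emotion."""
    W, *_ = np.linalg.lstsq(emotion_src, emotion_dst, rcond=None)
    return W

def retarget(ica, sources, W, emotion_idx=(0, 1)):
    """Map the emotion components into the target emotion space and
    reconstruct the full facial motion from the modified components."""
    mapped = sources.copy()
    mapped[:, list(emotion_idx)] = sources[:, list(emotion_idx)] @ W
    return ica.inverse_transform(mapped)
```

In this sketch, the speech (content) components pass through unchanged, so the lip motion is preserved while only the expressive style is remapped, mirroring the separation of emotion and speech described above.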
Keywords/Search Tags:Speech, Facial, Motion, Expressive, Realistic