
Mpeg-4 Compatible Facial Speech Animation System And Its Applications In Network Communications

Posted on: 2004-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: J B Lv
Full Text: PDF
GTID: 2208360092970591
Subject: Communication and Information System
Abstract/Summary:
MPEG-4 is an object-based multimedia compression standard that allows the different audio-visual objects in a scene (natural or synthetic) to be encoded independently. The face object is a special visual object defined in MPEG-4. Facial definition parameters (FDPs) and facial animation parameters (FAPs) are the parameter sets used to calibrate and animate the face object. MPEG-4 thus enables face animation to be integrated with multimedia communications and delivered over low-bit-rate channels.

TTS (Text to Speech) is one of the promising synthetic audio tools provided by MPEG-4, and its integration with facial animation opens up many applications. MPEG-4 defines an application program interface for the TTS synthesizer. Through this interface, the synthesizer can provide phonemes and related timing information to the face model. The phonemes are converted into corresponding mouth shapes, enabling simple talking-head applications.

Building on the previous work of our lab, I surveyed the current state of research on facial animation and chose "An MPEG-4 compatible facial animation system with TTS support and its application in network communication" as my research direction. Integrating facial animation with synthetic speech is not only a new field for our research, but can also play an important role in applications such as virtual newscasters and virtual communication over low bandwidth. I have therefore developed two prototype systems, called "Grimace VTTS" and "Grimace Chat" respectively.

This thesis focuses on the following aspects:
1. Standard: an overview of the MPEG-4 standard and the basic technology of the MPEG-4 face object.
2. Technology support: OpenGL and the TTS engine of Microsoft Speech SDK 5.0 are introduced in detail, together with some practice and examples.
3. Framework of the Grimace system: the frameworks of Grimace VTTS (a prototype aimed at a virtual newscaster) and Grimace Chat (a prototype aimed at virtual communication) are proposed, and each module is described.
4. Algorithms in the Grimace system: the specific algorithms adopted and optimized for the Grimace system are presented and discussed.
5. Implementations and applications: the tools used to develop the Grimace system are introduced, and its functions and usage are described in detail.
6. Evaluation of the Grimace system: both subjective and objective evaluations are presented.
7. Platform requirements and future work: the run-time platform requirements of the Grimace system are introduced, followed by future directions and my suggestions for this prototype system.
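
The phoneme-to-mouth-shape conversion mentioned in the abstract can be illustrated with a minimal sketch. It is not the Grimace implementation: the function and class names are hypothetical, and the phoneme groups and viseme index values are illustrative assumptions that should be checked against the MPEG-4 viseme table. The sketch takes phonemes with durations, as a TTS engine might report them, and produces timed mouth-shape keyframes.

```python
from dataclasses import dataclass

# Illustrative phoneme -> viseme table (an assumption, not taken from the thesis).
# Phonemes that look alike on the lips share one viseme index.
PHONEME_TO_VISEME = {
    "p": 1, "b": 1, "m": 1,
    "f": 2, "v": 2,
    "t": 4, "d": 4,
    "k": 5, "g": 5,
    "s": 7, "z": 7,
    "n": 8, "l": 8,
    "r": 9,
}
NEUTRAL_VISEME = 0  # used for any phoneme missing from the table


@dataclass
class Keyframe:
    start_ms: int     # time at which the mouth shape begins
    duration_ms: int  # how long the mouth shape is held
    viseme: int       # index of the mouth shape to display


def phonemes_to_keyframes(timed_phonemes):
    """Convert (phoneme, duration_ms) pairs into mouth-shape keyframes.

    Consecutive phonemes that map to the same viseme are merged into a
    single keyframe so the face model is not asked to re-trigger a shape
    it is already showing.
    """
    frames = []
    clock_ms = 0
    for phoneme, duration_ms in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, NEUTRAL_VISEME)
        if frames and frames[-1].viseme == viseme:
            frames[-1].duration_ms += duration_ms
        else:
            frames.append(Keyframe(clock_ms, duration_ms, viseme))
        clock_ms += duration_ms
    return frames


if __name__ == "__main__":
    for frame in phonemes_to_keyframes([("m", 80), ("b", 40), ("a", 100), ("s", 60)]):
        print(frame)
```

A real system would typically interpolate between neighboring keyframes rather than switch shapes abruptly; the sketch leaves that step out to keep the mapping itself visible.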
Keywords/Search Tags: MPEG-4, facial modeling, facial animation, TTS, texture mapping, virtual communication