
Multi-modal Behaviors Data Mining For Virtual Human Synthesis

Posted on: 2004-06-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Q Chen
Full Text: PDF
GTID: 1118360185995657
Subject: Computer application technology

Abstract/Summary:
To synthesize realistic virtual human multi-modal behaviors (speech, lip motion, facial expression, and gesture), synchronization among these behaviors is crucial, in addition to the realism of each individual behavior. This dissertation discusses how to apply and improve data mining methods for this key problem in virtual human multi-modal behavior synthesis. The contributions of the dissertation are as follows:

1) Data preprocessing: an MPEG-4 based labeled face feature-tracking method is adopted to obtain audio-visual synchronization data. The method avoids expensive capture equipment while still producing accurate data that conforms to the MPEG-4 standard. For segmenting the audio-visual synchronization data, a new quantitative segmentation method is proposed that makes segmentation simpler. For preprocessing, a method that generates face animation parameters from MPEG-4 labeled face feature points is adopted, exploring the possibility of extracting MPEG-4 face animation parameters (FAPs) directly from video.

2) Data feature extraction: a new MPEG-4 based visual speech feature representation, the face animation parameter pattern (FAPP), is proposed. The dissertation demonstrates how to apply unsupervised clustering and statistical methods to FAPP extraction (an illustrative sketch of this clustering step follows the abstract). Based on a large amount of audio-visual data, 29 basic FAPPs that describe facial motion characteristics and 15 basic orthogonal vectors that can synthesize FAPPs are obtained. Experiments show that the proposed visual speech feature representation effectively realizes audio-visual data mapping and vivid face animation.

3) Lip synchronization learning: aiming at the lip synchronization problem in a speech-driven face animation system, the dissertation addresses the complex many-to-many learning problem of designing a model that captures audio and visual context information while supporting real-time animation. Two learning methods are proposed: a FAPP-based audio-to-visual neural network mapping method, and a Parameter Dynamic Transition Network (PDTN) based audio-to-visual real-time mapping method. The former mainly considers real-time operation and the use of audio context information: based on clustering and the correlation of preceding and following frames, it maps speech feature vectors containing context information to face animation parameter patterns (a sketch of this mapping also follows the abstract). The latter goes further, considering not only real-time operation and audio context information but also the statistical context of lip motion and expression. Experiments show that both methods are effective and greatly improve the realism of lip synchronization in a speech-driven face animation system.

4) Multi-modal behavior data synchronization learning: this dissertation addresses two...
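The following is a minimal, illustrative sketch of how the unsupervised clustering and basis extraction described in contribution 2 could be set up: FAP frame vectors are clustered into 29 basic patterns and reduced to 15 orthogonal basis vectors. The array shapes, the use of k-means and PCA, and all variable names are assumptions for illustration, not the dissertation's exact procedure.

```python
# Illustrative sketch only: k-means stands in for the unsupervised clustering
# that yields the 29 basic FAPPs, and PCA stands in for the extraction of the
# 15 basic orthogonal vectors. The 68-dimensional FAP frames are an assumption.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def extract_basic_fapps(fap_frames: np.ndarray, n_patterns: int = 29) -> np.ndarray:
    """Cluster FAP frame vectors (n_frames x n_fap_dims) into basic patterns."""
    kmeans = KMeans(n_clusters=n_patterns, n_init=10, random_state=0)
    kmeans.fit(fap_frames)
    return kmeans.cluster_centers_          # one centroid per basic FAPP


def extract_orthogonal_basis(fap_frames: np.ndarray, n_vectors: int = 15) -> np.ndarray:
    """Find orthogonal basis vectors from which FAPPs can be reconstructed."""
    pca = PCA(n_components=n_vectors)
    pca.fit(fap_frames)
    return pca.components_                  # mutually orthogonal directions


if __name__ == "__main__":
    fap_frames = np.random.rand(5000, 68)   # synthetic stand-in for tracked FAP data
    basic_fapps = extract_basic_fapps(fap_frames)
    basis = extract_orthogonal_basis(fap_frames)
    print(basic_fapps.shape, basis.shape)   # (29, 68) (15, 68)
```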
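Similarly, the sketch below illustrates the general shape of a FAPP-based audio-to-visual mapping with audio context, as in contribution 3: each speech feature frame is concatenated with its preceding and following frames and mapped to a basic FAPP class. The acoustic feature dimension, context width, and network architecture are assumptions; the dissertation's actual model (and the PDTN variant) is not reproduced here.

```python
# Illustrative sketch only: a small MLP classifier maps a context-augmented
# speech feature vector to an index into the set of basic FAPPs.
import numpy as np
from sklearn.neural_network import MLPClassifier


def add_context(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Concatenate each frame with `width` preceding and following frames."""
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    windows = [padded[i:i + len(features)] for i in range(2 * width + 1)]
    return np.hstack(windows)


if __name__ == "__main__":
    audio = np.random.rand(2000, 13)                   # synthetic 13-dim speech features
    fapp_labels = np.random.randint(0, 29, size=2000)  # synthetic FAPP class labels

    x = add_context(audio, width=2)                    # each sample now spans 5 frames
    model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
    model.fit(x, fapp_labels)
    print(model.predict(x[:10]))                       # predicted basic-FAPP indices
```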
Keywords/Search Tags: Data mining, Machine learning, Virtual human synthesis, Multi-modal behaviors, Synchronization, Prosody learning, Face animation, Sign language synthesis