Chinese Speech Synchronized Realistic 3D Facial Animation

Posted on:2009-09-29

Degree:Doctor

Type:Dissertation

Country:China

Candidate:W Zhou

GTID:1118360242495828

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

Realistic speech-synchronized facial animation is a hot topic in computer graphics, with many applications in human-computer interfaces, entertainment, film and television production, and virtual reality. Great progress has been made in speech animation over the past 30 years, but many problems remain. Obtaining realistic, speech-driven, synchronized facial animation is a challenging subject that involves the kinematic and dynamic modeling and representation of individualized faces, the mechanism of co-articulation, and the acoustic and perceptual evaluation of the resulting animation. This work studies speech-synchronized facial animation from the following aspects.

Firstly, a novel lip muscle model based on Waters' muscle model is proposed. Muscle models are a simple and useful approach to human facial animation. However, an overly simple muscle model, such as Waters' muscle model, cannot naturally describe some complicated facial movements. The proposed lip muscle model improves the description of the complicated lip muscle movements that Waters' model does not capture accurately. According to facial anatomy, the global lip movement is divided into a few sub-movements, which serve as the basic units for describing it. The lip movement is reconstructed as a linear combination of these sub-movements. When modeling a talking face, several feature points are marked to obtain a set of lip parameters. All kinds of lip shapes are synthesized using the proposed lip muscle model together with the adjacent linear muscle model.
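
The linear-combination reconstruction described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the sub-movement names ("spread", "protrude"), the two feature points, and all displacement values are hypothetical, chosen only to show how a lip shape could be obtained as neutral positions plus a weighted sum of per-sub-movement displacements.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def combine_sub_movements(
    neutral: List[Point],
    sub_movements: Dict[str, List[Point]],
    weights: Dict[str, float],
) -> List[Point]:
    """Displace neutral lip feature points by a weighted sum of sub-movements."""
    result = [list(p) for p in neutral]
    for name, weight in weights.items():
        displacements = sub_movements[name]
        if len(displacements) != len(neutral):
            raise ValueError(f"sub-movement {name!r} has the wrong number of points")
        for i, d in enumerate(displacements):
            for axis in range(3):
                result[i][axis] += weight * d[axis]
    return [tuple(p) for p in result]


# Hypothetical data: left and right lip corners in the neutral pose (x, y, z).
neutral = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

# Hypothetical unit displacements of each feature point for two sub-movements.
sub_movements = {
    "spread": [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)],
    "protrude": [(0.25, 0.0, 0.5), (-0.25, 0.0, 0.5)],
}

# One lip shape: full spreading combined with half protrusion.
lip_shape = combine_sub_movements(
    neutral, sub_movements, {"spread": 1.0, "protrude": 0.5}
)
print(lip_shape)
```

In such a scheme a lip shape is fully described by a small weight vector, which is consistent with the low computational cost claimed for the model.
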
The experimental results show that the proposed model is practical, given its low computational cost and its ability to produce all kinds of realistic synthesized lip shapes.

Secondly, building on previous research on the Mandarin Chinese triphone model and co-articulation, a context-dependent visual speech co-articulation model is proposed. This approach combines the advantages of rule-based and learning-based methods to obtain realistic speech animation, and it focuses on the visual effect of co-articulation in Mandarin Chinese. To obtain the key synthesized lip shapes in continuous speech, a rule set for visual speech co-articulation is constructed, and the viseme weights corresponding to each phone are calculated from the quantized rule set. A sequence of lip shapes corresponding to the phones is then synthesized with our muscle-based facial model. To produce realistic speech animation, a learning-based approach selects the optimal synthesized transition lip shape between two phones from all possible candidates.

Thirdly, a novel lip movement model related to speech rate is proposed. In continuous speech, speech rate has a strong effect on the velocity and amplitude of lip movement, and different speakers adopt different lip movement strategies at different speech rates. As the rate increases, some speakers decrease the amplitude but maintain the velocity of the movement; others increase the velocity while maintaining the amplitude; and others adjust both parameters. Against this background, a lip movement model related to speech rate, with a high degree of individuality and naturalness, is proposed. Previous research shows a close relation between the EMG signal and speech rate, as well as a relation between the EMG signal and muscle force. In addition, the area covering the lip muscle can be considered an independent viscoelastic system.
The model is therefore constructed from research results on the viscoelasticity of skin-muscle tissue and the quantitative relationship between lip muscle force and speech rate. To show the validity of the model, we have applied it to our Chinese speech animation system.

Finally, to evaluate the quality of the synthesized speech animation system, a systematic evaluation approach for visual Chinese speech animation is proposed. The approach consists of two main tests: an acceptability test and an intelligibility test. In the acceptability test, the diagnostic acceptability measure approach is used, with an objective evaluation component added. In the intelligibility test, a novel approach called the Visual Chinese Modified Rhyme Test is proposed; it is based on the Chinese Modified Rhyme Test previously used in synthesized speech evaluation and focuses on Chinese speech animation. In addition, "punishment" and "forgiveness" factors are introduced to simulate human perception. Lastly, an overall evaluation result for the 3D speech animation system is obtained.

Based on the above research, a Chinese Synchronized Speech Animation Demo System is constructed, in which a natural and realistic talking head is synthesized.
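
As a rough illustration of the speech-rate-dependent lip movement model summarized above, the sketch below treats the lip region as a massless Kelvin-Voigt element (a spring and a damper in parallel) driven by a muscle force. This is an assumed simplification, not the dissertation's model: the stiffness, damping, and force values, the half-syllable activation pattern, and the function name `simulate_lip_opening` are all hypothetical, and the actual quantitative relationship between muscle force and speech rate is not given in the abstract.

```python
def simulate_lip_opening(
    syllables_per_second: float,
    force: float = 1.0,       # assumed constant muscle force while active
    stiffness: float = 40.0,  # assumed tissue stiffness k
    damping: float = 2.0,     # assumed tissue damping c
    dt: float = 0.001,
):
    """Simulate one open-close lip gesture; return (peak displacement, peak speed).

    The tissue obeys c * dx/dt + k * x = F(t), integrated with explicit Euler.
    """
    syllable_duration = 1.0 / syllables_per_second
    steps = int(round(syllable_duration / dt))
    x = 0.0
    peak_x = 0.0
    peak_speed = 0.0
    for step in range(steps):
        # Assumption: the muscle is active for the first half of the syllable
        # and relaxed for the second half.
        f = force if step < steps // 2 else 0.0
        v = (f - stiffness * x) / damping
        x += v * dt
        peak_x = max(peak_x, x)
        peak_speed = max(peak_speed, abs(v))
    return peak_x, peak_speed


slow_amp, slow_speed = simulate_lip_opening(3.0)
fast_amp, fast_speed = simulate_lip_opening(6.0)
print(f"slow: amplitude={slow_amp:.4f}, peak speed={slow_speed:.3f}")
print(f"fast: amplitude={fast_amp:.4f}, peak speed={fast_speed:.3f}")
```

With the force held constant, the faster rate gives a smaller amplitude at the same peak speed, which corresponds to the first strategy mentioned in the abstract. Increasing `force` with the rate would instead raise the speed while preserving more of the amplitude, so a per-speaker force-rate mapping is one way such a model could express individual strategies.
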

Keywords/Search Tags:

facial animation, lip muscle model, sub-movement, visual co-articulation model, triphone model, viscoelasticity, skin-muscle tissue, EMG, acceptability test, intelligibility test, Visual Chinese Modified Rhyme Test

Related items

1	A Study On The Expression Synthesis For 3D Face
2	The Study On Key Technologies Of Realistic Chinese Visual Speech Synthesis
3	A Facial Expression Animation System
4	Realistic 3D Facial Expression Animation Design And Realization
5	Biomechanical Modeling And Muscle Force Prediction Analysis Of Human Upper Limb
6	Design And Application Of Bionic Muscle Based On Three Element Model Of Skeletal Muscle
7	Chinese Speech Synchronized 3D Facial Animation
8	On the development of a clinical test of facial recognition based on a spatial frequency model of visual information processing
9	Research On Key Technology And Application Of McKibben Muscle For Soft Robot
10	Physiology Based Tongue Modeling And Speech Synchronized Animation Synthesis