Construction Of Dynamic Visemes Model Of Tibetan Vowels In Lhasa Dialect Based On Facial Motion Capture

Posted on: 2022-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Zhang
Full Text: PDF
GTID: 2505306746451994
Subject: Computer technology
Abstract/Summary:
With the rapid development of computer and animation technology, expectations for the realism of 3D lip animation have risen. Facial motion capture systems can now synthesize fairly realistic facial animation and are widely used in film and animation. Using a facial motion capture system, this thesis collects pronunciation motion data for the vowels of the Lhasa dialect of Tibetan. After a dynamic viseme analysis, two dynamic viseme models are constructed. Given speech data as input, the models generate lip motion sequence data, which is used to deform a three-dimensional virtual lip model and produce 3D lip animation that matches the speech. Finally, dynamic viseme speech animations of the Lhasa Tibetan vowels are synthesized, with good results. The main work is as follows:

(1) A data set was constructed and a dynamic viseme analysis was performed. Pronunciation data for the Lhasa Tibetan vowels were collected and processed. Using the K-means clustering algorithm, the vowel lip shapes were divided into 25 static lip-shape categories, and the vowel pronunciations were clustered into 19 dynamic viseme classes.

(2) Two dynamic viseme models of the Lhasa Tibetan vowels were constructed, one based on an LSTM network and one based on a GAN. Given Lhasa Tibetan vowel speech data as input, the models generate three-dimensional lip motion sequence data for the vowels. Measured by root mean square error, the GAN-based model achieves a value of 0.11, that is, an error of only 0.11 cm relative to the real facial motion capture data. Compared with the LSTM-based model, each point is significantly closer to the real data, and the total root mean square error improves by 0.03.

(3) A three-dimensional face model with blendshapes was established, and 25 static lip-shape blendshapes related to the dynamic visemes of the Lhasa Tibetan vowels were created. The motion sequence data for the pronunciation phase, generated by the dynamic viseme model, were used as input, and B-spline interpolation was applied between each end of the data and the natural closed-lip shape. This realizes dynamic viseme 3D lip animation synthesis for the Lhasa Tibetan vowels, and 19 dynamic viseme speech animations were synthesized.
Keywords/Search Tags:Dynamic Viseme, Tibetan Lhasa, Motion Capture, Face Animation
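
The abstract evaluates both models by the root mean square error between generated lip motion data and real facial motion capture data, but does not give the exact formula. The sketch below shows one common way to compute such a value, assuming the error is averaged over every coordinate of every marker in every frame. The function name `rmse`, the data layout (a list of frames, each a list of `(x, y, z)` points), and the toy numbers are all hypothetical and are not taken from the thesis.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two motion sequences.

    Assumed layout (hypothetical): each sequence is a list of frames,
    and each frame is a list of (x, y, z) marker positions.
    The error is averaged over all coordinates of all markers.
    """
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same number of frames")

    squared_sum = 0.0
    count = 0
    for frame_p, frame_r in zip(predicted, reference):
        if len(frame_p) != len(frame_r):
            raise ValueError("frames must have the same number of markers")
        for point_p, point_r in zip(frame_p, frame_r):
            for a, b in zip(point_p, point_r):
                squared_sum += (a - b) ** 2
                count += 1

    if count == 0:
        raise ValueError("sequences must not be empty")
    return math.sqrt(squared_sum / count)


# Toy data, made up for illustration: one frame with two markers.
generated = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
captured = [[(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]]
print(rmse(generated, captured))
```

If the marker coordinates are recorded in centimetres, the result is in centimetres as well, which is how a value such as 0.11 can be read directly as a distance.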