
Speech-driven Animation Based On Actor-Critic Method

Posted on: 2021-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Luo
Full Text: PDF
GTID: 2428330623967790
Subject: Computer Science and Technology
Abstract/Summary:
Speech-driven automatic lip animation synthesis is essential for many applications, whereas traditional face-capture methods require expensive equipment and remain time-consuming. The focus of this thesis is to generate, from a given audio clip and character model, a lip animation that matches the audio, accurately reflecting the coordination between lip motion and speech as well as the emotional expression conveyed by the voice. At the same time, the system is designed to generalize across character models, so that it can be quickly applied to multiple characters.

Our framework is based on a reinforcement-learning Actor-Critic network. The Actor takes as its state input the acoustic features of the input speech sequence together with the facial deformation parameters of the character's 3D blendshape face model, and predicts the facial deformation parameters of the next time step. The Critic defines a reward function for the action taken in the current state and is optimized against ground-truth data. A temporal difference algorithm is applied so that the generated lip animation progressively approaches the real facial deformation. We use MFCCs as audio features: they not only capture contextual information effectively but also reflect the speaker's emotional state across the whole sequence. In addition, we extract facial action units and incorporate the facial action coding coefficients of the character's full face, making the character's expression look more realistic and match the emotion of the audio; over time, the model learns contextual information and latent representations of the emotional states in speech.

Experiments on real audiovisual corpora covering different actors, facial actions, and emotional states show that our method performs well in terms of lip matching and temporal smoothness, and that the lip animations it produces are more accurate and realistic. Because the approach is independent of the character's face model, the trained model is readily applicable to various tasks in human-computer interaction and animation.
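To make the described setup concrete, below is a minimal PyTorch sketch of an Actor-Critic pair trained with a temporal-difference target. The network sizes, the blendshape dimension, the discount factor, and the reward definition (negative distance to the ground-truth face parameters) are all illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal sketch: Actor predicts next-frame face parameters from
# (audio features, current face parameters); Critic is trained with a
# TD target built from a reward and the value of the next state.
# All dimensions and hyperparameters below are assumptions.
import torch
import torch.nn as nn

N_MFCC = 13          # MFCC coefficients per frame (assumed)
N_BLENDSHAPE = 51    # facial deformation (blendshape) parameters (assumed)
GAMMA = 0.99         # discount factor for the TD target (assumed)

class Actor(nn.Module):
    """State (audio features + current face params) -> next face params."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_MFCC + N_BLENDSHAPE, 256), nn.ReLU(),
            nn.Linear(256, N_BLENDSHAPE),
        )
    def forward(self, mfcc, face_params):
        return self.net(torch.cat([mfcc, face_params], dim=-1))

class Critic(nn.Module):
    """Scores a (state, action) pair with a scalar value."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_MFCC + 2 * N_BLENDSHAPE, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )
    def forward(self, mfcc, face_params, action):
        return self.net(torch.cat([mfcc, face_params, action], dim=-1))

def reward(action, ground_truth):
    """Assumed reward: closeness to the captured (ground-truth) frame."""
    return -torch.mean((action - ground_truth) ** 2, dim=-1, keepdim=True)

# One illustrative TD update step on a dummy batch of 8 frames.
actor, critic = Actor(), Critic()
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

mfcc_t, mfcc_t1 = torch.randn(8, N_MFCC), torch.randn(8, N_MFCC)
face_t = torch.randn(8, N_BLENDSHAPE)     # current-frame face parameters
gt_t1 = torch.randn(8, N_BLENDSHAPE)      # ground-truth next-frame params

action = actor(mfcc_t, face_t)            # predicted next-frame parameters
with torch.no_grad():                     # TD target uses no gradients
    next_action = actor(mfcc_t1, action)
    td_target = reward(action, gt_t1) + GAMMA * critic(mfcc_t1, action, next_action)

critic_loss = nn.functional.mse_loss(critic(mfcc_t, face_t, action.detach()), td_target)
opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

actor_loss = -critic(mfcc_t, face_t, action).mean()  # follow the Critic's score
opt_a.zero_grad(); actor_loss.backward(); opt_a.step()
```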
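The MFCC features themselves can be extracted with a standard audio library. A short sketch using librosa follows; the file name, sample rate, hop length, and number of coefficients are assumptions for illustration, not the settings used in the thesis.

```python
# Sketch of per-frame MFCC extraction (assumed parameters).
import librosa

y, sr = librosa.load("speech.wav", sr=16000)            # hypothetical input file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=400, hop_length=160)  # 25 ms window, 10 ms hop
print(mfcc.shape)  # (13, n_frames): one 13-dim feature vector per audio frame
```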
Keywords/Search Tags:Actor-Critic, temporal difference algorithm, MFCC, action units