A Research On Visual Synthesis Of Chinese Emotional Speech

Posted on: 2010-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: X B Geng
Full Text: PDF
GTID: 2178360278472439
Subject: Circuits and Systems
Abstract/Summary:
With the development of human-computer interaction and the spread of speech synthesis applications, higher demands are being placed on synthesized speech. Human-computer interaction becomes friendlier and more convenient if users can see video while hearing the voice, and synthesized speech sounds much more natural if it can imitate the speaker's emotion.

This thesis focuses on emotional speech synthesis and visual speech. An emotional corpus consisting only of emotional sentences is constructed, covering four emotions: happiness, anger, sadness and surprise. A prosody model based on an artificial neural network (ANN) is applied to improve the naturalness of the synthesized speech, and a text-to-speech (TTS) system is built to synthesize emotional speech by waveform concatenation.

Image mosaicking is used for visual speech synthesis. Different phonemes correspond to different images, and each emotion has twelve images. Morphing between images is based on the biharmonic equation: after key points are selected and the mapping is distorted, cross-integration is used to generate the key frames.

To verify performance, a simplified text-to-visual-speech (TTVS) system is constructed. Listening and visual tests indicate that the output speech is natural and that the morphing between images is smooth. The morphing can be made smoother by increasing the number of key points.
Keywords/Search Tags: Speech Synthesis, Visual Speech, Emotional Speech, ANN