
Research On Emotion Speech Synthesis And Building Based On HMM

Posted on: 2013-11-26  Degree: Master  Type: Thesis
Country: China  Candidate: J Chen  Full Text: PDF
GTID: 2248330371990435  Subject: Signal and Information Processing
Abstract/Summary:
With the continuous development of speech synthesis technology, synthesized speech is getting closer to the real human voice. To make synthetic speech more human-like and more acceptable to listeners, people have begun to hope that it can carry emotional content. As a result, emotional speech synthesis emerged as a new research direction and has been developing rapidly. This paper first introduces the current state of research and the associated technical challenges of speech synthesis technology. It then analyzes several basic speech synthesis methods and compares them. Compared with other methods, speech synthesis based on the Hidden Markov Model (HMM) is more convenient. First, it can build a new system automatically through training in a short period. Second, the training process requires no manual intervention. Finally, the whole training process does not depend on the pronunciation, speaking style, or emotion type. Therefore, the HMM-based speech synthesis method is selected to synthesize emotional speech in this paper. This paper also reviews the currently popular emotion theories and, after investigating several emotion classification methods, selects happiness, anger, sadness and calmness as the objects of study. After surveying existing speech databases and deciding on the speech corpus, recording environment and emotion types, an emotional speech database covering these four emotions is built for analyzing emotional voices. By analyzing the emotional voices in the database, the variation of the acoustic characteristics of the four emotional voices is summarized. In addition, the characteristic parameters and their extraction methods are listed.
After this analysis, a complete emotional speech synthesis system is designed in the paper. It comprises three parts: an HMM-based trainable speech synthesis system, an emotion analysis module, and a parameter modification module. Tests are carried out to confirm its feasibility and effectiveness. In addition, this paper studies the fundamental frequency jitter parameter for speech synthesis. Experiments compare voices synthesized by the original synthesis system with voices synthesized by the new synthesis system that includes jitter. The results show that the voices synthesized by the new system are more natural and smooth. Research on fundamental frequency jitter is therefore of significance in the field of speech processing.
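As an illustration of the fundamental frequency jitter parameter mentioned above, the sketch below computes one common measure, local jitter: the mean absolute difference between consecutive pitch periods divided by the mean period. This definition is an assumption for illustration; the abstract does not state which jitter measure the thesis uses, and the function name `local_jitter` is hypothetical.

```python
def local_jitter(periods):
    """Return local jitter for a sequence of consecutive pitch periods (seconds).

    Assumed definition (not taken from the thesis): mean absolute difference
    between neighbouring periods, divided by the mean period.
    """
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    # Absolute change from each period to the next one.
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


if __name__ == "__main__":
    # Hypothetical pitch periods around 10 ms (about 100 Hz), for illustration only.
    example_periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
    print(f"local jitter: {local_jitter(example_periods):.4f}")
```

A perfectly periodic signal gives a jitter of zero under this measure; larger cycle-to-cycle variation in the pitch period gives a larger value.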
Keywords/Search Tags: Emotional speech synthesis, Emotional speech database, Prosodic features, Hidden Markov Model