
Hidden Markov Model-based Speech Synthesis Technology Research

Posted on: 2007-08-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y J Wu
GTID: 1118360185451410
Subject: Signal and Information Processing
Abstract/Summary:
As the quality of synthetic speech gradually improves, users expect more from text-to-speech (TTS) systems, in particular more diverse synthetic voices. It is therefore of high research and practical value to study new methods that can build a TTS system quickly in a trainable way, so as to meet users' varied needs. This thesis studies HMM-based trainable TTS in depth and systematically, covering framework construction, improvements to key techniques, and related applications. The detailed research work in this thesis is as follows:

Firstly, the application of the Hidden Markov Model (HMM) to speech synthesis is studied, with a focus on improving HMM-based automatic segmentation. Because the conventional HMM training method based on the Maximum Likelihood (ML) criterion is inconsistent with the goal of segmentation, a discriminative training method is adopted and a new criterion named Minimum Segmentation Error (MSGE) is introduced. In this method, a loss function is defined by introducing a new measure of segmentation error. Minimizing the overall empirical loss with the Generalized Probabilistic Descent (GPD) algorithm also minimizes the segmentation error. The analysis and improvement of HMM-based automatic segmentation lay a solid foundation for the follow-up work on HMM-based trainable TTS.

Secondly, based on the available HMM training methods and parameter generation algorithm, the complete technical framework of trainable TTS is constructed, which includes an automatic training procedure and a synthesis back-end. Under this framework, a TTS system that meets a user's requirements can be built quickly by training on the input speech data. Moreover, to verify the effectiveness of the trainable TTS framework, a Chinese trainable TTS system is constructed by designing and optimizing the contextual features and question set according to the characteristics of Chinese.

Thirdly, the baseline framework of trainable TTS is improved in several aspects. First, by analyzing the characteristics and the modeling effect of Mel-cepstral (MCEP) and Line Spectral Pair (LSP) parameters, and taking account of the relation between...
Keywords/Search Tags: speech synthesis, HMM, trainable TTS, minimum generation error