The Application Of HMM In Parameter-Based Text-To-Speech System

Posted on: 2009-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Du
GTID: 2178360242995321
Subject: Biomedical engineering
Abstract/Summary:
Text-to-speech (TTS) is a key part of the human-computer interface. With the rapid development of computing, TTS has evolved from concatenative systems to parameter-based systems. The main topic of this thesis is the application of the hidden Markov model (HMM) in parameter-based TTS systems. The detailed research work in this thesis is as follows.

First, based on an analysis of the existing framework for automatic speech segmentation, we propose variable-length unit models to improve segmentation accuracy. Automatic speech segmentation is widely used in developing TTS systems. Current HMM-based automatic segmentation systems usually use context-dependent phonemes, such as triphones, as the acoustic model. Because adjacent phonemes can merge strongly through phonetic integration, coarticulation, and phoneme variation, we propose a variable-length HMM whose units are chosen according to both phonetic characteristics and the amount of training data. We define the concept of the variable-length HMM and discuss the criteria for unit selection and the way to build and train the models. Results on an English speech corpus show that segmentation accuracy increases from 79.55% to 89.13% for long speech units with strong coarticulation, and that overall segmentation accuracy also improves when variable-length HMMs are used. In conclusion, the variable-length HMM achieves higher segmentation accuracy than the triphone model, especially for long speech units. This analysis and improvement of automatic segmentation lays a solid foundation for the follow-up work on the parameter-based TTS system.

Second, based on existing model training and parameter generation techniques, we analyze several key techniques in the overall framework and build a parameter-based TTS system for Chinese. We construct a complete framework for parameter-based TTS, including an automatic training procedure and a synthesis back-end, so that a new system can be built quickly by training on the input speech data. To verify the effectiveness of this framework, we build a parameter-based TTS system for Chinese. Furthermore, we improve the system's performance by introducing a minimum generation error (MGE) criterion to guide model training.

Finally, we present the application of the parameter-based TTS system to speaker conversion. Here we use MLLR adaptation of the HMMs: a certain amount of training data from a target speaker is used to adapt the speaker-independent model to a speaker-dependent one. In this way, we can construct a TTS system for a specific speaker.
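
As an illustration of the MLLR adaptation mentioned above, the following is a minimal sketch of how an MLLR mean transform is applied once it has been estimated. It assumes the standard mean-only form, in which each Gaussian mean of the speaker-independent model is mapped by a shared affine transform (adapted mean = A * mean + b). The abstract does not state which MLLR variant the thesis uses, and the function names, the toy means, and the values of A and b are all hypothetical. Estimating A and b from the adaptation data is not shown.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mllr_adapt_mean(mean: Vector, A: Matrix, b: Vector) -> Vector:
    """Apply an affine MLLR mean transform: adapted = A * mean + b."""
    if len(A) != len(b) or any(len(row) != len(mean) for row in A):
        raise ValueError("shape mismatch between A, b, and mean")
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_model(means: List[Vector], A: Matrix, b: Vector) -> List[Vector]:
    """Adapt every Gaussian mean in one regression class with a shared (A, b)."""
    return [mllr_adapt_mean(m, A, b) for m in means]


# Toy speaker-independent means (hypothetical values, 2-dimensional features).
si_means = [[1.0, 2.0], [0.0, -1.0]]

# Hypothetical transform. In a real system (A, b) is estimated from the target
# speaker's adaptation data by maximizing likelihood; this sketch skips that.
A = [[1.0, 0.5], [0.0, 2.0]]
b = [0.5, -1.0]

sd_means = adapt_model(si_means, A, b)
print(sd_means)
```

Sharing one transform across many Gaussians is what lets a small amount of speech from the target speaker move the whole speaker-independent model, including units that never occur in the adaptation data.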
Keywords/Search Tags: speech synthesis, hidden Markov model, automatic speech synthesis, speaker conversion