The Application Of HMM In Parameter-Based Text-To-Speech System

Posted on:2009-05-01

Degree:Master

Type:Thesis

Country:China

Candidate:J Du

Full Text:PDF

GTID:2178360242995321

Subject:Biomedical engineering

Abstract/Summary:

Text-to-speech (TTS) is a key part of the human-computer interface. With the rapid development of computing, TTS has evolved from concatenative systems to parameter-based systems. The main topic of this thesis is the application of the hidden Markov model (HMM) in parameter-based TTS systems. The detailed research work in this thesis is as follows.

First, based on an analysis of the existing framework for automatic speech segmentation, we propose a variable-length unit model to improve segmentation accuracy. Automatic speech segmentation is widely used when developing a TTS system. Current HMM-based automatic speech segmentation systems usually use context-dependent phonemes, such as triphones, as the acoustic model. In this thesis, we propose a variable-length HMM whose units are chosen according to both phonetic characteristics and the amount of training data, motivated by the degree of phonetic integration between adjacent phonemes, coarticulatory effects, and phoneme variation. We define the concept of the variable-length HMM and discuss the criteria for unit selection and the way to build and train the models. Results on an English speech corpus show that segmentation accuracy increases from 79.55% to 89.13% for long speech units with strong coarticulation, and that overall segmentation accuracy also improves to some extent when variable-length HMMs are used. In conclusion, the variable-length HMM achieves higher segmentation accuracy than the triphone model, especially for long speech units. The analysis and improvement of automatic segmentation lay a solid foundation for the follow-up work on the parameter-based TTS system.

Moreover, based on the available model training and parameter generation techniques, we analyze several key techniques in the overall framework and build a parameter-based TTS system for Chinese. We construct a complete framework for parameter-based TTS, which includes an automatic training procedure and a synthesis back-end; under this framework, a TTS system can be built quickly by training on the input speech data. Next, to verify the effectiveness of this framework, a parameter-based TTS system for Chinese is built. Furthermore, we improve the performance of the system by introducing a minimum generation error (MGE) criterion to guide model training.

Finally, the application of the parameter-based TTS system to speaker conversion is presented. For this application, we use maximum likelihood linear regression (MLLR) adaptation of the HMMs: a certain amount of training data from a target speaker is used to adapt the speaker-independent model into a speaker-dependent one. Thus, we can construct a TTS system for a specific speaker.
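The abstract does not describe how the adaptation is implemented, so the following is only a minimal sketch of the standard MLLR mean update, in which every Gaussian mean mu in a regression class is replaced by A*mu + b. All names (adapt_mean, adapt_model, the state labels) and all numbers are hypothetical, and the transform (A, b) is assumed to have already been estimated from the target speaker's data; estimating it is not shown.

```python
# Minimal sketch of MLLR mean adaptation (hypothetical names, toy numbers).
# Assumption: the transform (A, b) was already estimated from target-speaker data.

def adapt_mean(A, b, mu):
    """Return the adapted mean A*mu + b for a single Gaussian."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]

def adapt_model(A, b, means):
    """Apply one shared transform to every Gaussian mean in a regression class."""
    return {state: adapt_mean(A, b, mu) for state, mu in means.items()}

# Toy speaker-independent means for two HMM states (2-dimensional features).
si_means = {"a_s2": [1.0, 2.0], "a_s3": [0.0, -1.0]}

# Toy transform standing in for one estimated from adaptation data.
A = [[1.0, 0.5],
     [0.0, 2.0]]
b = [0.5, -1.0]

# Speaker-dependent means obtained by transforming the speaker-independent ones.
sd_means = adapt_model(A, b, si_means)
print(sd_means)
```

Because a single transform is shared by many Gaussians, a small amount of target-speaker data can move the whole model, which is why this kind of adaptation suits the speaker-conversion setting described above.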

Keywords/Search Tags:

speech synthesis, hidden Markov model, automatic speech synthesis, speaker conversion

Related items

1	Research And Implementation Of Speech Synthesis Method For Helping Old Robots
2	Research On Personalized Speech Synthesis Based On Deep Speech Representations
3	Research On Statistical Parametric Mandarin-Tibetan Cross-lingual Speech Synthesis
4	Research On Statistical Parametric Speech Synthesis Integrating Speech Production Mechanisms
5	A Study On Speech Synthesis And Visual Speech Synthesis Based On Neural Networks
6	Research On Emotional Speech Synthesis Based On Deep Neural Network
7	Research On Automatic Labeling Of Speech Synthesis Corpora
8	Based Hmm Can Be Training Vietnamese Speech Synthesis System
9	Mandarin Speech Synthesis System And Rhythm Adjustment
10	Research And Implementation Of Speech Synthesis Based On FastSpeech