Hidden Markov Model-based Speech Synthesis Technology Research

Posted on:2007-08-02

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y J Wu

Full Text:PDF

GTID:1118360185451410

Subject:Signal and Information Processing

Abstract/Summary:


As the quality of synthetic speech has gradually improved, users expect more from text-to-speech (TTS) systems, in particular more diverse synthetic voices. It is therefore valuable, for both research and applications, to study methods that can build a TTS system quickly through training and so meet varied user needs. This thesis studies HMM-based trainable TTS in depth and systematically, covering framework construction, improvements to key techniques, and related applications. The detailed work is as follows:

Firstly, the application of Hidden Markov Models (HMMs) to speech synthesis is studied, with a focus on improving HMM-based automatic segmentation. Because conventional HMM training under the Maximum Likelihood (ML) criterion is inconsistent with the segmentation task, a discriminative training method is adopted and a new criterion, Minimum Segmentation Error (MSGE), is introduced. In this method, a loss function is defined through a new measure of segmentation error. Minimizing the overall empirical loss with the Generalized Probabilistic Descent (GPD) algorithm also minimizes the segmentation error. This analysis and improvement of HMM-based automatic segmentation lays a solid foundation for the subsequent work on HMM-based trainable TTS.

Secondly, based on existing HMM training methods and parameter generation algorithms, the complete technical framework of trainable TTS is constructed, which includes an automatic training procedure and a synthesis back-end. Under this framework, a TTS system matching a user's requirements can be built quickly by training on the input speech data. Moreover, to verify the effectiveness of the trainable TTS framework, a Chinese trainable TTS system is constructed by designing and optimizing the contextual features and question set according to the characteristics of Chinese.

Thirdly, the baseline framework of trainable TTS is improved in several aspects. First, by analyzing the characteristics and the modeling effect of Mel-cepstral (MCEP) and Line Spectral Pair (LSP) parameters, and taking account of the relation between...
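The abstract does not give the MSGE loss or the HMM parameter update, so the following is only a toy sketch of the general idea of minimizing a smoothed segmentation-error loss with a GPD-style update. Everything in it is an assumption made for illustration: the "model" is a single bias correction `theta` added to raw boundary times (not an HMM), the loss is a sigmoid of the absolute boundary error, and the names `gamma`, `tol_ms`, `eps0`, and the sample data are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def smoothed_loss(pred_ms, ref_ms, gamma=0.1, tol_ms=10.0):
    """Sigmoid-smoothed boundary error (hypothetical stand-in for an MSGE-style loss).

    Low when |pred - ref| is well inside tol_ms, approaching 1 well outside it.
    """
    return sigmoid(gamma * (abs(pred_ms - ref_ms) - tol_ms))


def mean_loss(samples, theta):
    """Empirical loss: average smoothed loss over (raw_ms, ref_ms) pairs."""
    return sum(smoothed_loss(raw + theta, ref) for raw, ref in samples) / len(samples)


def gpd_fit(samples, theta=0.0, epochs=100, eps0=50.0,
            gamma=0.1, tol_ms=10.0, seed=0):
    """GPD-style fit: per-sample gradient steps with a decreasing step size.

    Toy assumption: the predicted boundary is raw_ms + theta, so the only
    trainable parameter is a global bias correction.
    """
    rng = random.Random(seed)
    total = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)  # present samples in random order
        for raw, ref in order:
            err = raw + theta - ref
            s = smoothed_loss(raw + theta, ref, gamma, tol_ms)
            sign = (err > 0) - (err < 0)
            # Chain rule: d loss / d theta = gamma * s * (1 - s) * sign(err)
            grad = gamma * s * (1.0 - s) * sign
            eps = eps0 * (1.0 - step / total)  # linearly decaying step size
            theta -= eps * grad
            step += 1
    return theta


# Hypothetical data: automatic boundaries lag hand-labelled ones by about 20 ms.
samples = [(118.0, 100.0), (272.0, 250.0), (420.0, 400.0), (639.0, 620.0)]
theta = gpd_fit(samples)
print("bias correction (ms):", round(theta, 2))
print("mean loss before/after:", mean_loss(samples, 0.0), mean_loss(samples, theta))
```

The sigmoid makes the error count differentiable, which is what allows a gradient-based procedure such as GPD to act on a segmentation-error criterion instead of likelihood. In a real system the gradient would flow into HMM parameters through the alignment, which this sketch does not attempt.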

Keywords/Search Tags:

speech synthesis, HMM, trainable TTS, minimum generation error


Related items

1	HMM-based Trainable Speech Synthesis For Dai Language
2	HMM-Based Trainable Vietnamese Speech Synthesis System
3	Trainable Chinese Speech Synthesis System Based On HMM
4	Research On The Modeling And Generation Of Fundamental Frequencies In Statistical Parametric Speech Synthesis
5	A Research On Speech Synthesis Based On Statistical Modeling And Pronunciation Error Detection
6	Research On Speech Generation Of Dongxiang Dialect
7	Research On Acoustic Modelling And Text Generation In Concept-to-Speech Conversion
8	Research Of Personalized Speech Generation
9	Research On Statistical Parametric Mandarin-Tibetan Cross-lingual Speech Synthesis
10	Research And Implementation Of Speech Synthesis Method For Elderly-Assistance Robots