
Research On HMM-Based Cross-Lingual Speech Synthesis

Posted on: 2012-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: H Liu
GTID: 2178330338491947
Subject: Signal and Information Processing
Abstract/Summary:
As international communication has grown in recent years, people from different countries increasingly need to communicate across different languages. A practical cross-lingual speech synthesis system is needed to meet this demand. Our research focuses on how to perform cross-lingual speaker adaptation in the absence of target-language data, and thereby build a cross-lingual speech synthesis system that facilitates international communication. This thesis is organized as follows:

Chapter 1 briefly describes the research background and then introduces the two main traditional corpus-based speech synthesis methods. It also gives a general description of the hidden Markov model (HMM) based intra-lingual model adaptation technique.

Chapter 2 introduces the state-of-the-art technique in statistical parametric speech synthesis, the HMM-based Trainable text-to-speech (TTS) method, including its basic framework and key technical points. A detailed intra-lingual model adaptation framework and the related algorithms are then presented. These form the foundation of the subsequent research.

Chapter 3 focuses on improving the conventional Trainable TTS system by optimizing the decision-tree-based model clustering part of the existing baseline system. We adopt different criteria to conduct the splitting process and to determine the stopping point, and carefully evaluate the effect of different combinations of these criteria.

Chapter 4 discusses how to achieve cross-lingual model adaptation. The simple method based on phoneme mapping is revised and improved by using a more accurate phoneme mapping table combined with data selection, and cross-lingual prosodic information mapping is introduced to make use of prosodic information. Experiments show that the proposed method performs well.

Chapter 5 presents an implementation of a cross-lingual speech synthesis system. This system can synthesize arbitrary Chinese and English speech that simulates a target Chinese speaker's voice characteristics, even if the speaker cannot speak English.
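The phoneme-mapping idea summarized for Chapter 4 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table entries, the Mandarin-to-English mapping direction, the segment format, and the names `PHONE_MAP` and `map_and_select` are hypothetical and are not taken from the thesis. Dropping segments with no usable mapping stands in, very roughly, for the data-selection step.

```python
# Hypothetical Mandarin-to-English phoneme mapping table. The entries are
# illustrative only and are NOT the table used in the thesis.
# None marks a phoneme assumed to have no close counterpart.
PHONE_MAP = {
    "b": "b",
    "m": "m",
    "sh": "sh",
    "a": "aa",
    "i": "iy",
    "x": None,
}


def map_and_select(segments, phone_map=PHONE_MAP):
    """Relabel source-language segments with target-language phonemes.

    segments: list of (phoneme, start_sec, end_sec) tuples.
    Segments whose phoneme is unmapped (missing or None) are dropped,
    a crude stand-in for data selection.
    """
    mapped = []
    for phone, start, end in segments:
        target = phone_map.get(phone)
        if target is None:
            continue  # no reliable mapping: exclude from adaptation data
        mapped.append((target, start, end))
    return mapped


# Hypothetical aligned utterance from the target speaker (times in seconds).
utterance = [("m", 0.00, 0.08), ("a", 0.08, 0.25),
             ("x", 0.25, 0.33), ("i", 0.33, 0.50)]
relabeled = map_and_select(utterance)
```

In this sketch the relabeled segments would then serve as adaptation data for models of the other language; how the thesis actually builds its table and selects data is not stated in the abstract.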
Keywords/Search Tags: HMM, Speech synthesis, Decision tree clustering, Cross-language, Model adaptation