Research On HMM-Based Cross-Lingual Speech Synthesis

Posted on:2012-02-08

Degree:Master

Type:Thesis

Country:China

Candidate:H Liu

GTID:2178330338491947

Subject:Signal and Information Processing

Abstract/Summary:

As international communication has become increasingly common in recent years, people from different countries often need to communicate across different languages. To meet this need, a practical cross-lingual speech synthesis system is required. The focus of this research is how to perform cross-lingual speaker adaptation in the absence of target-language data, and thereby to build a cross-lingual speech synthesis system that facilitates international communication. This thesis is organized as follows:

Chapter 1 briefly describes the research background and then introduces the two main traditional corpus-based speech synthesis methods. It also gives a general description of the hidden Markov model (HMM) based intra-lingual model adaptation technique.

Chapter 2 introduces the state-of-the-art technique in statistical parametric speech synthesis, the HMM-based Trainable text-to-speech (TTS) synthesis method, including its basic framework and some key technical points. A detailed intra-lingual model adaptation framework and the related algorithms are then presented. These form the foundation of the research work that follows.

Chapter 3 focuses on improving the conventional Trainable TTS system. This is achieved by optimizing the decision-tree-based model clustering stage of the existing baseline system. Different criteria are adopted to guide the splitting process and to determine the stopping point, and the effects of different combinations of these criteria are carefully evaluated.

Chapter 4 discusses how to achieve cross-lingual model adaptation. The simple method based on phoneme mapping is revised and improved by using a more accurate phoneme mapping table combined with data selection, and cross-lingual prosodic information mapping is introduced to make use of prosodic information. Experiments show that the proposed method performs well.

Chapter 5 presents an implementation of a cross-lingual speech synthesis system. The system can synthesize arbitrary Chinese and English speech that simulates the voice characteristics of a target Chinese speaker, even if that speaker cannot speak English.

Keywords/Search Tags:

HMM, Speech synthesis, Decision tree clustering, Cross-language, Model adaptation

Related items

1	Research Of Chinese Speech Synthesis Technology Based On Speech Database
2	Study On HMM-Based Chinese Speech Synthesis
3	Research On Cross-corpus Speech Emotion Recognition Technology Based On Transfer Learning
4	Research On Mandarin-Tibetan Cross-lingual Speech Synthesis
5	A Research On Speech Synthesis Based On Statistical Modeling And Pronunciation Error Detection
6	Study And Improve On The Mongolian Speech Recognition System
7	Research On Statistical Parametric Speech Synthesis Integrating Speech Production Mechanisms
8	Research On Statistical Parametric Mandarin-Tibetan Cross-lingual Speech Synthesis
9	Cross-lingual Speech Synthesis Based On Statistical Models
10	Research On Speech Synthesis Of Dungan Language