
Research On Unit Selection In Large-Corpus Based English Text-to-Speech System

Posted on: 2007-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Pei
GTID: 2178360182978498
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
The growing demand for more convenient human-computer interaction creates a good opportunity for the development of Text-to-Speech (TTS) technology. In the past few years, large-corpus-based concatenative synthesis has become a widely used method that greatly improves the naturalness of synthesized speech. As a critical part of this method, unit selection begins with a phonetic and prosodic specification of a desired utterance and searches the corpus for the unit sequence that best fits the specification. Research on unit selection includes the identification of basic synthesis units, the definition of appropriate cost functions, and ways to improve efficiency by reducing computational complexity.

Motivated by the demand for bilingual (Chinese and English) speech synthesis, this thesis is grounded in the development of a large-corpus-based English TTS system and centers on unit selection. It covers coarticulation in concatenative synthesis, CART-based (classification and regression tree) unit pre-selection, and a hybrid-unit-based selection algorithm. The achievements are as follows:

1. The effect of coarticulation in real continuous English speech is studied, and phone sequences with strong coarticulation are identified through a perception experiment. Based on this result and on the multi-syllable words and unlimited vocabulary of English, the synthesis units of the system are identified and a multi-layer hybrid unit model is constructed.

2. Unit pre-selection is realized by introducing CART. For each basic synthesis unit, a decision tree that maps the unit's context to its prosody is constructed offline. These decision trees are used to pre-select units during online unit selection, increasing overall efficiency.

3. A hybrid-unit-based selection algorithm is designed. It consists of the proposed corpus structure, which makes candidate search easy; a cost function that combines target cost and concatenation cost, with different weights for different basic synthesis units; and a method for finding the best sequence among all candidate units.

4. To enlarge the original speech corpus, the thesis proposes creating a new speech corpus for common English words that do not exist in the original corpus, with the goal of ensuring overall synthesis quality and further improving efficiency. The procedure for doing so is presented in the thesis.

Experiments indicate that using the hybrid unit model for concatenative synthesis improves the naturalness of the synthesized speech, while the introduction of decision trees, the design of the corpus structure, and the creation of the common-word speech corpus increase the efficiency of unit selection. Together, these results indicate that the algorithm is effective and that the TTS system runs in real time.
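The search over candidate units described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Unit` and `Target` fields, both cost formulas, and the weights `w_target` and `w_concat` are hypothetical choices made only for illustration, and the dynamic-programming (Viterbi-style) search is a common technique for this problem that the abstract does not explicitly name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    label: str       # unit identity, e.g. a phone or syllable
    corpus_pos: int  # position of the unit in the recorded corpus
    pitch: float     # mean F0 in Hz (illustrative prosodic feature)
    duration: float  # length in seconds


@dataclass(frozen=True)
class Target:
    label: str
    pitch: float
    duration: float


def target_cost(t: Target, u: Unit) -> float:
    # Distance between desired and actual prosody (scaling is hypothetical).
    return abs(t.pitch - u.pitch) / 100.0 + abs(t.duration - u.duration) * 10.0


def concat_cost(a: Unit, b: Unit) -> float:
    # Units that were adjacent in the recording join without a seam.
    if b.corpus_pos == a.corpus_pos + 1:
        return 0.0
    return 1.0 + abs(a.pitch - b.pitch) / 100.0


def select_units(targets, corpus, w_target=1.0, w_concat=1.0):
    """Return (total_cost, units) for the cheapest candidate sequence."""
    # Candidate list per target: every corpus unit with a matching label.
    cands = [[u for u in corpus if u.label == t.label] for t in targets]
    if not targets or any(not c for c in cands):
        raise ValueError("some target has no candidate unit in the corpus")

    # best[i][j] = (cost of cheapest path ending in cands[i][j], back-pointer)
    best = [[(w_target * target_cost(targets[0], u), -1) for u in cands[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in cands[i]:
            tc = w_target * target_cost(targets[i], u)
            cost, back = min(
                (best[i - 1][k][0] + w_concat * concat_cost(p, u), k)
                for k, p in enumerate(cands[i - 1])
            )
            row.append((cost + tc, back))
        best.append(row)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(cands[i][j])
        j = best[i][j][1]
    path.reverse()
    return total, path


corpus = [
    Unit("h", 0, 120.0, 0.05), Unit("ay", 1, 120.0, 0.15),
    Unit("h", 10, 100.0, 0.05), Unit("ay", 20, 100.0, 0.15),
]
targets = [Target("h", 100.0, 0.05), Target("ay", 100.0, 0.15)]
total, units = select_units(targets, corpus)
print([u.corpus_pos for u in units], round(total, 3))
```

With these toy values, the search prefers the pair of units that were contiguous in the recording even though their pitch is further from the target, because the seamless join outweighs the prosodic mismatch. Setting `w_concat=0.0` reverses that choice, which shows how the weights trade off the two costs.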
Keywords/Search Tags: English Text-to-Speech, Unit Selection, Coarticulation, Hybrid Unit, Unit Pre-Selection