Research On Key Technologies Of Database Construction In Uyghur TTS

Posted on: 2013-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: S M J K D E Ka
GTID: 2218330374966378
Subject: Signal and Information Processing
Abstract/Summary:
Database construction is an important part of speech synthesis technology. It includes text collection, selection of typical texts, recording, labeling, and the design and compression of the speech database. This paper focuses on automatic labeling and compression.

To obtain a waveform-concatenation-based Uyghur speech synthesis system with a smaller footprint and better intelligibility and naturalness, we designed the texts, made the recordings, and labeled them according to the characteristics of the Uyghur language. To reduce the manual workload and improve labeling accuracy, and taking Uyghur phonetic features into account, the labeling uses automatic segmentation of phonetic units based on monophone HMMs and context-dependent triphone HMMs. We studied the use of HMMs and HTK, obtained monophone and triphone models by training, and then implemented automatic phonetic segmentation of Uyghur speech. In designing the speech corpus structure, we built a syllable corpus that takes the syllable as the basic synthesis unit. To handle syllables that do not exist in that corpus, we also built a phoneme corpus that takes the phoneme as the synthesis unit. Experimental results show that the proposed waveform-concatenation-based Uyghur speech synthesis system, which uses syllables and phonemes as its smallest synthesis units, has a relatively small corpus and produces synthesized speech with good intelligibility.

To reduce storage space while allowing the speech to be decompressed without distortion, we adopted lossless compression and selected the Huffman algorithm because it is fast and easy to implement. All syllables and phonemes are compressed, but during synthesis only the selected candidate synthesis units are decompressed, so the entire speech corpus never needs to be decompressed.
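
The per-unit compression scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the unit labels `syl_01` and `syl_02`, and the synthetic byte sequences standing in for waveform data are all assumptions made for illustration. The sketch only shows the general idea of Huffman-coding each synthesis unit separately so that a single selected unit can be decoded without touching the rest of the corpus.

```python
import heapq
from collections import Counter


def build_code(data):
    """Build a Huffman code table (byte value -> bit string) for one unit."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code inside them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def compress_unit(data):
    """Return (code table, number of valid bits, packed bytes) for one unit."""
    code = build_code(data)
    bits = "".join(code[b] for b in data)
    padded = bits + "0" * (-len(bits) % 8)  # pad to a whole number of bytes
    packed = int(padded, 2).to_bytes(len(padded) // 8, "big") if padded else b""
    return code, len(bits), packed


def decompress_unit(code, nbits, packed):
    """Losslessly recover the original bytes of one unit."""
    decode = {c: s for s, c in code.items()}
    bits = bin(int.from_bytes(packed, "big"))[2:].zfill(len(packed) * 8)[:nbits]
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in decode:  # Huffman codes are prefix-free
            out.append(decode[current])
            current = ""
    return bytes(out)


# Hypothetical corpus: unit label -> synthetic bytes standing in for audio.
corpus = {
    "syl_01": bytes([0, 0, 1, 1, 1, 2, 0, 0, 3, 0] * 20),
    "syl_02": bytes([5, 5, 5, 5, 6, 7, 5, 5] * 25),
}

# Every unit is compressed independently and stored under its own key.
store = {name: compress_unit(pcm) for name, pcm in corpus.items()}

# At synthesis time only the selected candidate unit is decompressed.
selected = decompress_unit(*store["syl_02"])
```

Storing a separate code table with every unit keeps the units independent at the cost of some overhead per unit. A shared table for the whole corpus would be an alternative design; the abstract does not say which one the thesis uses.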
Keywords/Search Tags: Uyghur, Speech synthesis, Corpus, Automatic segmentation, HMM, Data compression