Research On Tibetan Lhasa Speech Synthesis Based On HMM

Posted on:2015-03-14

Degree:Master

Type:Thesis

Country:China

Candidate:J X Zhang

GTID:2298330467474444

Subject:Computer application technology

Abstract/Summary:

This thesis studies speech synthesis for the Lhasa dialect of Tibetan. The system is built on the trainable speech synthesis (Trainable TTS) framework based on the Hidden Markov Model (HMM), and covers data preparation, model training, and parameter synthesis. The main work and results are as follows.

First, a small speech corpus of the Tibetan Lhasa dialect was built. Taking into account the characteristics of Lhasa Tibetan consonants, vowels, and tones, approximately 2000 declarative sentences were selected from the speech data of Tibet Daily for the speech synthesis experiments. The selected sentences were annotated with sub-word speech tags, prosodic phrase boundaries, and other labels; phonemes and prosody were labelled with the Praat software; and programs were written to produce monophone and triphone label files that contain timing information.

Second, automatic phoneme segmentation algorithms for monophones and triphones were studied, and the selected sentences were labelled with the triphone automatic segmentation algorithm. The segmentation results of the two HMM configurations were tested and analysed: the overall average segmentation accuracy was 80.69% for the monophone models and 88.74% for the triphone models. The triphone algorithm is therefore significantly more accurate than the monophone algorithm, and using it improves the accuracy and consistency of the corpus annotation.

Third, based on the grammatical structure, rhythm, and speech features of Tibetan, context-dependent attribute rules and questions for decision-tree clustering were designed, the context-dependent information was labelled, and mel-generalized cepstral data were obtained.

Finally, speech synthesis of the Tibetan Lhasa dialect was achieved. The Tibetan phoneme was chosen as the basic synthesis unit, and the acoustic models were obtained with HMM-based trainable speech synthesis after extracting the fundamental frequency, duration, and MFCC parameters. The synthesized speech was evaluated with objective and subjective tests, and some suggestions for improvement were made. The average MOS of the synthesized speech is 2.33.

In short, the synthesized Tibetan speech has a certain degree of intelligibility and recognizability, which lays a foundation for the research and development of Tibetan speech synthesis systems.
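The label-generation step mentioned in the abstract (turning time-aligned single-phone labels into triphone labels) can be illustrated with a minimal sketch. This is not the thesis's actual program: the `left-centre+right` naming, the HTK-style time unit of 100 ns, the silence symbols `sil` and `sp`, and the phone names in the usage example are all assumptions made for illustration.

```python
from typing import List, Tuple

# One time-aligned segment: (start in seconds, end in seconds, phone name).
Segment = Tuple[float, float, str]


def to_triphone_labels(
    segments: List[Segment],
    silence: Tuple[str, ...] = ("sil", "sp"),  # assumed silence symbols
) -> List[Segment]:
    """Rename each phone to left-centre+right, keeping its time stamps.

    Silence segments keep their plain name; a phone at the very start or
    end of the utterance simply omits the missing context.
    """
    labels: List[Segment] = []
    last = len(segments) - 1
    for i, (start, end, phone) in enumerate(segments):
        if phone in silence:
            labels.append((start, end, phone))
            continue
        name = phone
        if i > 0:
            name = f"{segments[i - 1][2]}-{name}"
        if i < last:
            name = f"{name}+{segments[i + 1][2]}"
        labels.append((start, end, name))
    return labels


def format_label_file(labels: List[Segment]) -> str:
    """Render labels as 'start end name' lines, times in 100 ns units
    (the HTK label convention, assumed here)."""
    return "\n".join(
        f"{round(start * 1e7)} {round(end * 1e7)} {name}"
        for start, end, name in labels
    )


if __name__ == "__main__":
    # Hypothetical alignment for a single syllable surrounded by silence.
    mono = [(0.0, 0.1, "sil"), (0.1, 0.25, "k"), (0.25, 0.4, "a"), (0.4, 0.5, "sil")]
    print(format_label_file(to_triphone_labels(mono)))
```

In this sketch the triphone file reuses the single-phone boundaries unchanged, so any boundary refinement from the segmentation step would have to be applied to the single-phone alignment first.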

Keywords/Search Tags:

speech synthesis, Hidden Markov Model, Lhasa Tibetan, model training

Related items

1	The Research On Segmentation Acoustic Model Based On MPE Tibetan Lhasa Dialect
2	Research On Tibetan Lhasa Dialect Speech Recognition Based On Deep Learning
3	Research On Tibetan Lhasa Dialect Speech Recognition Based On TANDEM Feature
4	Research On Mandarin-Tibetan Cross-lingual Speech Synthesis
5	Research On Method Of Unit Selection Speech Synthesis Based On Hidden Markov Model
6	Research On Acoustic Modeling Methods In Statistical Parametric Speech Synthesis
7	Research On Statistical Parametric Speech Synthesis Of Tibetan Lhasa Dialect
8	Research On Statistical Parametric Mandarin-Tibetan Cross-lingual Speech Synthesis
9	A Study On Lhasa Tibetan Prosodic Model Of Journalese
10	Trainable HMM-Based Vietnamese Speech Synthesis System