
Research On Automatic Segmentation For Mandarin TTS System

Posted on: 2008-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: X L Yuan
Full Text: PDF
GTID: 2178360215482486
Subject: Signal and Information Processing
Abstract/Summary:
Corpus-based concatenative speech synthesis is currently the most widely used approach for synthesizing speech with high articulation and intelligibility. The accuracy of automatic segmentation in a Mandarin TTS system strongly affects the quality of the synthesized speech, and for this reason automatic segmentation for Mandarin TTS has attracted great attention in recent years.

The author surveyed the current state of automatic segmentation and, on that basis, used the mainstream method of HMM-based forced alignment to build a baseline automatic segmentation system. The baseline was then enhanced with a novel model adaptation method, and the results showed that the new method improves segmentation precision. Extensive experiments were carried out to verify how the corresponding parameters affect automatic segmentation. Most similar papers ignore the parameter selection process, yet these parameters have been shown to have a great impact on a real system.

Most existing studies on automatic segmentation are based on a single model, which is either context-dependent or context-independent. An inherent problem of the single-model method is that each boundary receives only one estimate, even though different models perform differently in a given boundary environment. In this paper, we propose two methods that train mapping rules between the acoustic models and boundaries in similar acoustic environments, and then use the mapping rules to select the best model for each boundary.

First, we propose a hybrid model method for automatic segmentation of a Mandarin text-to-speech corpus. The boundaries of acoustic units are categorized into eleven phonetic groups. For a given phonetic group of boundaries, the method trains mapping rules between the boundary group and the acoustic models, which include the initial-final monophone-based HMM (IFMM), the semi-syllable monophone-based HMM (SSMM) and the initial-final triphone-based HMM (IFTM).

Second, using the C4.5 decision tree algorithm, we propose a classification approach that trains mapping rules between boundaries located in similar acoustic environments and IFMM, SSMM and IFTM, so that the best estimate for each boundary can be selected.

The experimental results show that both the hybrid model method and the decision tree classification method achieve better performance than the single-model method in terms of boundary accuracy and time shift.
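
The hybrid model idea described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the group names, the 20 ms tolerance, the tie-breaking rule and the dictionary layout are all assumptions made for illustration, and the abstract does not list the eleven phonetic groups or say how the mapping rules were scored. The sketch assumes that each training boundary carries a manually labelled reference time and one estimate (in milliseconds) from each of IFMM, SSMM and IFTM, and that a mapping rule simply assigns to every boundary group the model that most often lands within the tolerance.

```python
from collections import defaultdict

# The three acoustic models named in the abstract.
MODELS = ("IFMM", "SSMM", "IFTM")


def train_mapping_rules(training_boundaries, tolerance_ms=20.0):
    """Map each boundary group to the model that performs best on it.

    Each training boundary is a dict with hypothetical keys:
      "group":     label of the phonetic boundary group (assumed names)
      "reference": manually labelled boundary time in ms
      "estimates": dict mapping model name -> estimated boundary time in ms
    The 20 ms tolerance is an assumed default, not a value from the thesis.
    """
    hits = defaultdict(lambda: {m: 0 for m in MODELS})
    total_abs_err = defaultdict(lambda: {m: 0.0 for m in MODELS})

    for boundary in training_boundaries:
        group = boundary["group"]
        for model in MODELS:
            err = abs(boundary["estimates"][model] - boundary["reference"])
            total_abs_err[group][model] += err
            if err <= tolerance_ms:
                hits[group][model] += 1

    rules = {}
    for group in hits:
        # Most boundaries within tolerance wins; ties are broken by the
        # smaller total absolute error (an assumed tie-breaking rule).
        rules[group] = max(
            MODELS,
            key=lambda m: (hits[group][m], -total_abs_err[group][m]),
        )
    return rules


def select_boundary(boundary, rules, default_model="IFMM"):
    """Return the estimate of the model mapped to this boundary's group.

    Groups unseen during training fall back to default_model (assumed).
    """
    model = rules.get(boundary["group"], default_model)
    return boundary["estimates"][model]
```

With rules trained this way, each boundary in a new utterance takes the estimate of whichever model was mapped to its group, instead of relying on one model for every boundary. The decision tree variant would replace the fixed group label with features of the acoustic environment and let a classifier predict the model.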
Keywords/Search Tags:TTS, HMM, Automatic Segmentation, Hybrid Model, Decision Tree