HMM-based Mandarin Speech Synthesis and Prosodic Optimization

Posted on: 2013-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: K Hu
GTID: 2248330374966986
Subject: Communication and Information System

Abstract/Summary:
Natural speech expresses not only linguistic information but also non-linguistic information such as the speaker's attitude, intention, emotion and personality. With the growing demand for Text-to-Speech (TTS) technology, synthetic voices are required to convey not only linguistic information but also non-linguistic information, which is often expressed through prosodic features such as stress, rhythm and intonation. It is therefore of great significance to study the prosodic model in TTS. This thesis studies the prosodic model in statistical parametric speech synthesis.

HMM TTS has clear advantages in flexibility, multilingual support and footprint. The technique is studied, and various configurations, including the speech unit, HMM topology, acoustic parameters and context information, are carefully designed and selected. An HMM TTS system is built for Mandarin synthesis.

Context features affecting pronunciation and prosody are studied from different perspectives, including phone classification, retroflexed phone combination and prosodic hierarchy. Text transcriptions and a question set incorporating these context features are designed to lay the foundation for prosodic modeling in Mandarin HMM TTS.

It is of great importance to optimize the prosodic model by considering particular conditions including language, speaker and text. From this point of view, F0 distribution characteristics and a multi-level duration model are introduced into HMM TTS. Experimental results demonstrate that these methods improve the prediction of prosodic features and the naturalness of synthetic speech.
Keywords/Search Tags: Speech synthesis, Prosodic optimized, Hidden Markov Model, Fundamental frequency, Duration