Research And Implementation Of Chinese Prosodic Structure Prediction Model

Posted on: 2015-05-24  Degree: Master  Type: Thesis
Country: China  Candidate: Y Hu  Full Text: PDF
GTID: 2298330422977673  Subject: Control theory and control engineering
Abstract/Summary:
Speech synthesis is a technology that converts text to speech. It has become an important means of human-computer interaction and is widely used in many fields. Generally, converting text into speech involves three steps: text processing -> prosody processing -> voice generation. Text processing analyzes the input text and extracts the information needed in the later steps. Human speech has characteristic tone, pauses, and duration, which together are called speech prosody. Voice generation splices the waveform based on the parameters obtained in the previous two steps. To make the synthesized voice fully express the emotion contained in the text and sound closer to a human voice, prosodic analysis must be performed on the text. Prosody is an important factor affecting the naturalness of synthesized voice, so prosodic structure prediction is a very important part of speech synthesis. Methods of prosodic structure prediction have shifted from hand-crafted rules to statistical models, and the topic has become an active research branch in the field of information science. Based on an analysis and comparison of several main prosodic structure prediction algorithms, this paper focused on the maximum entropy method of prosodic structure prediction and explored two aspects of it: combining hand-crafted rules with the statistical model, and the training method of the statistical model.
This paper described the principle of maximum entropy, parameter estimation, feature template formulation, feature selection, and related content, and designed a prosodic structure prediction model based on maximum entropy. It tried to improve the performance of the prediction model by introducing manual parsing into the statistical model.
To improve training on small-scale sample sets, a semi-supervised learning algorithm was introduced into the training of the prosodic structure prediction model, giving the designed model a degree of self-learning ability. Finally, through experimental comparison, this paper verified the feasibility of this improvement.
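As a rough illustration of how a maximum entropy classifier and semi-supervised training can fit together, the following minimal sketch treats prosodic boundary prediction as a binary decision (boundary or no boundary after a word) and uses self-training as the semi-supervised step. The abstract does not say which semi-supervised algorithm, feature templates, or label set the thesis used, so everything here is an assumption: the function names, the feature strings such as `left=n`, the confidence threshold, and the toy data are all hypothetical.

```python
import math

def predict_prob(weights, feats):
    """P(boundary | feats) under a binary maximum-entropy (logistic) model."""
    score = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-score))

def train_maxent(samples, epochs=200, lr=0.5):
    """Fit feature weights by batch gradient ascent on the log-likelihood.

    samples: list of (feats, label); feats is a set of indicator-feature
    names (hypothetical), label is 1 (boundary) or 0 (no boundary).
    """
    weights = {}
    for _ in range(epochs):
        grad = {}
        for feats, label in samples:
            p = predict_prob(weights, feats)
            # Gradient term: observed count minus expected count.
            for f in feats:
                grad[f] = grad.get(f, 0.0) + (label - p)
        for f, g in grad.items():
            weights[f] = weights.get(f, 0.0) + lr * g / len(samples)
    return weights

def self_train(labeled, unlabeled, threshold=0.8, rounds=5):
    """Self-training: repeatedly add confidently predicted unlabeled
    samples to the training set as pseudo-labeled data, then retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    weights = train_maxent(labeled)
    for _ in range(rounds):
        confident, rest = [], []
        for feats in pool:
            p = predict_prob(weights, feats)
            if p >= threshold:
                confident.append((feats, 1))
            elif p <= 1.0 - threshold:
                confident.append((feats, 0))
            else:
                rest.append(feats)
        if not confident:
            break  # nothing confident enough; stop early
        labeled.extend(confident)
        pool = rest
        weights = train_maxent(labeled)
    return weights, labeled

# Toy data (hypothetical): part-of-speech tags left and right of a
# candidate boundary, plus a constant bias feature.
LABELED = [
    ({"bias", "left=n", "right=v"}, 1),
    ({"bias", "left=n", "right=p"}, 1),
    ({"bias", "left=a", "right=n"}, 0),
    ({"bias", "left=d", "right=v"}, 0),
]
UNLABELED = [
    {"bias", "left=n", "right=d"},
    {"bias", "left=a", "right=v"},
    {"bias", "left=d", "right=n"},
]

if __name__ == "__main__":
    weights, final_set = self_train(LABELED, UNLABELED)
    print(len(final_set) - len(LABELED), "pseudo-labeled samples added")
    print(round(predict_prob(weights, {"bias", "left=n", "right=p"}), 3))
```

The threshold controls the usual self-training trade-off: a higher value adds fewer but more reliable pseudo-labels, which matters when the initial labeled set is small. A full system would predict several boundary levels rather than one binary label, which this sketch does not attempt.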
Keywords/Search Tags: Speech Synthesis, Prosodic Structure, Maximum Entropy, Semi-supervised Learning