The Method And Implementation Of ToBI Automatic Prosodic Labeling In English Text To Speech System

Posted on:2017-02-16

Degree:Master

Type:Thesis

Country:China

Candidate:Y M Wang

GTID:2308330488465243

Subject:Electronics and Communications Engineering

Abstract/Summary:

With the rapid growth of the Internet, speech synthesis technology has also developed quickly. As a branch of artificial intelligence, speech synthesis aims to make machines speak at a level close to that of a human speaker. Prosody, which largely determines how expressive synthesized speech sounds, is therefore receiving more and more attention. This thesis studies automatic ToBI prosodic labeling and evaluates its effect when the automatically predicted labels are loaded into an English TTS system. The specific work is as follows.

First, the thesis reviews the background and historical development of speech synthesis and introduces a variety of synthesis methods, including the two mainstream approaches: parametric synthesis based on HMMs and concatenative synthesis based on a large corpus. Given the importance of the ToBI system, Chapter 2 introduces it in detail.

Second, the following chapters describe the C4.5 decision tree algorithm, the maximum entropy algorithm, and the conditional random field (CRF) algorithm. For the implementation, the thesis introduces several model training and testing methods. By analyzing and comparing the different models on the different prosodic labels, a suitable model can be chosen for each kind of label, and the resulting automatic labels are loaded into the English TTS system.

Finally, the thesis reports the prediction results of the different models. The results show that the C4.5 decision tree algorithm and the CRF model can both be used effectively to predict ToBI labels. After the prosody prediction model was added, a subjective MOS (mean opinion score) listening test was carried out on the speech synthesized by the English TTS system. Compared with the previous MOS, the score of the newly synthesized sentences rises by 0.31, a clear improvement in prosody. This further indicates that the experimental ideas and methods in the thesis are reliable. In addition, the thesis summarizes the experimental results, identifies several aspects of ToBI automatic labeling that can be optimized, and offers an outlook and recommendations for ToBI prosody prediction.
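As an illustration of the C4.5 decision tree mentioned in the abstract, the following minimal sketch computes the gain ratio, the criterion C4.5 uses to choose which feature to split on. The feature (a part-of-speech tag per word), the labels (pitch accent "H*" versus "none"), and the toy data are hypothetical and are not taken from the thesis; the thesis's actual feature set and corpus are not described in this record.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(feature_values, labels):
    """Gain ratio of a discrete feature with respect to the labels.

    gain ratio = information gain / split information, where the split
    information is the entropy of the feature's own value distribution.
    """
    total = len(labels)
    # Group the labels by the feature value they occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A feature with a single value cannot split the data.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one part-of-speech tag and one accent label per word.
pos = ["NOUN", "NOUN", "VERB", "DET", "DET", "PREP"]
accent = ["H*", "H*", "H*", "none", "none", "none"]

print(f"gain ratio of POS for accent prediction: {gain_ratio(pos, accent):.3f}")
```

In a full C4.5 implementation this score would be computed for every candidate feature at each node, and the feature with the highest gain ratio would be chosen for the split.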

Keywords/Search Tags:

Speech synthesis, ToBI prosodic annotation, C4.5 decision tree, CRF model, Prosody prediction

Related items

1	The Research On Dai Prosody Prediction Module Of Speech Synthesis
2	Chinese Speech Synthesis System Improvements And Implementation
3	An Improved Speech Synthesis Method
4	The Research Of Prosodic Control Algorithm And Realization For Chinese Speech Synthesis
5	Research On Problems Of Text-To-Speech System
6	The Study Of Prediction Methods On The Mongolian Prosody
7	Improvement Of Prosodic Structure Prediction In Speech Synthesis
8	Research On 3D Visible Speech Animation Driven By Prosody Text
9	Research And Implementation Of Chinese Prosodic Structure Prediction Model
10	Based On The Binary Semantic Annotation Of Waveform Concatenation Speech Synthesis