Improvement Of Prosodic Structure Prediction In Speech Synthesis

Posted on:2018-04-28

Degree:Master

Type:Thesis

Country:China

Candidate:T H Wang

GTID:2348330512993159

Subject:Electronic and communication engineering

Abstract/Summary:

Prosodic structure is one of the key factors affecting the naturalness of synthesized speech, so prosodic structure prediction is becoming increasingly important. Traditional prosody prediction methods have achieved many successes in application, but they select superficial input features such as part-of-speech (POS) information and ignore the influence of deep semantic and grammatical information on prosodic structure. In addition, when the data are highly complex, these methods suffer from problems such as a narrow scope of application, over-fitting, and over-reliance on rules. Overcoming these limitations requires a model with strong modeling capability for complex data, whose input represents deep information. This thesis therefore introduces a deep neural network prediction model that uses word embeddings as input features in the prosodic structure prediction module. The main work of this thesis is as follows:

(1) Trained word embeddings replace the traditional POS information as the input of the prediction model, and length information and punctuation information are added to the input features to improve the learning effect of the model.

(2) The prosodic prediction model is built from stacked feed-forward and bidirectional long short-term memory (BLSTM) recurrent network layers. The results under different network structures are compared to find a better network structure for predicting prosodic structure.

(3) To further improve the accuracy of deep-learning-based prosodic structure prediction, a decoding step after the network model combines the output scores of the network with the transition scores between prosodic structure categories, and uses dynamic programming to determine the output sequence of prosodic-level category labels.
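The decoding step in item (3) can be illustrated with a minimal sketch. The abstract does not give the exact algorithm, so the code below assumes a standard Viterbi-style dynamic program that adds per-position network scores to label-to-label transition scores. The function name `viterbi_decode`, the two-label setup, and all score values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the label sequence with the highest total score.

    emissions[t][j]   : network output score for label j at position t
    transitions[i][j] : score for moving from label i to label j
    (Both score tables are assumptions for this sketch.)
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    # best[j] = best total score of any path ending in label j so far
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_labels):
            # Pick the previous label that maximizes the score of reaching j
            prev = max(range(num_labels),
                       key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path
    last = max(range(num_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores: 3 positions, 2 labels (0 = no boundary, 1 = boundary)
emissions = [[2.0, 1.0], [1.0, 1.5], [1.0, 2.0]]
# Switching labels is heavily penalized in this toy example
transitions = [[0.0, -5.0], [-5.0, 0.0]]
decoded = viterbi_decode(emissions, transitions)
```

With these toy scores, picking the best label at each position independently would start with label 0 and then switch to label 1, paying the switching penalty. The dynamic program instead scores whole sequences, so the transition scores can override a locally attractive choice.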

Keywords/Search Tags:

Speech synthesis, Prosodic structure prediction, Deep learning, Word2vec, Deep Neural Network

Related items

1	Research On Prosodic Structure Prediction Based On Deep Neural Network
2	Research And Implementation Of Chinese Prosodic Structure Prediction Model
3	Research On Neural Network Based Statistical Parametric Speech Synthesis
4	Research On Automatic Labeling Of Speech Synthesis Corpora
5	Research On Deep Learning Based Small-Sized Unit Concatenation Speech Synthesis
6	Research And Implementation Of Chinese Speech Synthesis Based On Deep Learning
7	Research On Neural Network-based Acoustic Modeling For Speech Synthesis
8	The Research Of Prosodic Control Algorithm And Realization For Chinese Speech Synthesis
9	Research On Emotional Speech Synthesis Based On Deep Neural Network
10	Study On Speech Synthesis Based On Deep Neural Network