Research And Implementation Of Chinese Speech Synthesis System Based On Articulatory Feature

Posted on: 2020-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: Q Yin
Full Text: PDF
GTID: 2428330590471850
Subject: Control engineering
Abstract/Summary:
Speech synthesis, also known as text-to-speech, converts text into speech and is an indispensable part of human-computer interaction. Many artificial intelligence applications rely on it, including smart speakers, smart homes and car navigation. Corpus-based unit concatenation speech synthesis achieves high intelligibility and naturalness, but its perceived quality still needs improvement, in particular the abrupt change in sound at concatenation points. The main cause of this abrupt change is that traditional speech synthesis cannot realize high-level co-articulation between units. Because co-articulation arises from the natural continuous motion of the human vocal organs, this thesis approaches unit concatenation speech synthesis from the perspective of articulatory features. The work is as follows:

First, so that the units in the corpus fully cover the co-articulation sound variants in Chinese, this thesis designs and builds a corpus that satisfies this variant coverage, based on a summary of co-articulation in vowels, and uses electromagnetic articulography (EMA) data to label the articulatory features of each unit in the corpus. A high-quality corpus is also the source of the synthesized speech.

Then, to obtain articulatory parameters for the input text that reflect co-articulation, this thesis constructs a Hidden Markov Model (HMM) so that the articulatory parameters output by the model transition continuously. Articulatory parameters that fully reflect the characteristics of co-articulation are also the basis for the subsequent unit selection.

Finally, to accurately select the optimal unit from the corpus, this thesis designs a cost calculation method suited to articulatory parameters, based on cost calculation theory. Each selected optimal unit is then processed and concatenated by a smoothing algorithm, so that the speech synthesis system outputs speech with smooth transitions.

This thesis evaluates the synthesized speech from both objective and subjective perspectives, using context-based unit concatenation speech synthesis as the baseline for comparison. The experimental results show that unit concatenation speech synthesis based on articulatory features makes the transitions of the synthesized speech at concatenation points more natural and better meets the requirements of co-articulation.
Keywords/Search Tags:speech synthesis, units concatenation, co-articulation, articulatory feature
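
The unit selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual cost formulas, so everything below is an assumption made for illustration: the target cost is taken as the Euclidean distance between a predicted articulatory vector and a unit's mean articulatory vector, the concatenation cost as the distance between the end of one unit and the start of the next, and the search as a standard dynamic-programming (Viterbi-style) pass. The names `select_units`, `mean`, `start`, `end`, `w_target` and `w_concat` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length articulatory vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_units(targets, candidates, w_target=1.0, w_concat=1.0):
    """Pick one candidate unit per position with minimal total cost.

    targets:    one predicted articulatory vector per position
                (assumed here to come from a model such as an HMM).
    candidates: per position, a list of dicts with hypothetical keys
                'id', 'mean', 'start', 'end' (articulatory vectors).
    Returns (list of chosen unit ids, total cost).
    """
    # best[i][j]: lowest cumulative cost of any path ending at candidate j of position i
    best = [[w_target * euclidean(targets[0], u["mean"]) for u in candidates[0]]]
    back = [[None] * len(candidates[0])]

    for i in range(1, len(targets)):
        row, ptr = [], []
        for unit in candidates[i]:
            target_cost = w_target * euclidean(targets[i], unit["mean"])
            # Cost of arriving from each candidate at the previous position:
            # its cumulative cost plus the articulatory mismatch at the join.
            arrive = [
                best[i - 1][k] + w_concat * euclidean(prev["end"], unit["start"])
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(arrive)), key=arrive.__getitem__)
            row.append(arrive[k_min] + target_cost)
            ptr.append(k_min)
        best.append(row)
        back.append(ptr)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for i in range(len(targets) - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return [candidates[i][j]["id"] for i, j in enumerate(path)], total
```

Under these assumptions, a unit whose own articulatory match is slightly worse can still be selected when it joins its neighbour with a smaller articulatory jump, which is the kind of trade-off a concatenation cost is meant to capture.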