Research And Application Of Speech Synthesis Method Integrating Emotional Expressiveness

Posted on: 2022-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
Full Text: PDF
GTID: 2518306524993749
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of computer technology, text-to-speech (TTS) has become an indispensable part of human-computer interaction. Research on Chinese emotional speech synthesis, however, is still in its infancy: most work addresses Chinese speech synthesis without emotion. This thesis studies Chinese emotional speech synthesis based on recurrent neural networks. The main research contents are as follows:

1. An end-to-end Chinese speech synthesis method based on a recurrent neural network is proposed. It addresses the synthesis of Chinese speech under limited resources: the trained model generates a Mel spectrogram, which a vocoder then converts into speech. In the experiment, the model is trained on the "standard shell data set" and the synthesized speech is compared with real speech. It achieves a mean opinion score (MOS) of 4.1, close to the score of real speech.

2. An emotional speech synthesis method based on a variational autoencoder (VAE) is proposed. Because few Chinese emotional data sets are available, the VAE is used to learn emotions and generate emotional features, which are then combined with the speech synthesis method to generate emotional speech. In the experiment, the VAE is trained on a small emotional corpus, the "CASIA data set", to obtain emotional features, which are combined with the speech synthesis method above. The final synthesized emotional speech is compared with real emotional speech and achieves a MOS of 4.0, not much different from real speech.

3. A Chinese emotional speech synthesis application for service robots is designed and implemented. The application provides a voice reading function: the user enters text for the service robot in a browser interface, and the robot reads it aloud with emotion according to the text and the emotion button the user selects. The browser interface is concise and clear, which makes it easy to operate.
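
The abstract does not describe the network in detail. As a rough illustration of the variational encoding step (a variational autoencoder, VAE, in common terminology), the sketch below shows the two generic ingredients such a model typically relies on: sampling a latent emotion feature with the reparameterization trick, and the KL term that regularizes it toward a standard normal prior. It assumes a diagonal Gaussian posterior. The function names, batch size, and latent dimension are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over the latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)


rng = np.random.default_rng(0)

# Hypothetical encoder outputs: a batch of 2 utterances, 4 latent dimensions.
mu = np.zeros((2, 4))
log_var = np.zeros((2, 4))

z = reparameterize(mu, log_var, rng)     # latent emotion features, shape (2, 4)
kl = kl_to_standard_normal(mu, log_var)  # one KL value per utterance
```

In a full system of this kind, `z` would condition the speech synthesis model, and the KL term would be added to a reconstruction loss during training. How the thesis combines the two is not stated in the abstract.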
Keywords/Search Tags: Speech Synthesis, Emotional Speech Synthesis, Variational Autoencoder, Recurrent Neural Network