
Research On Speech Emotion Recognition Method For Chinese Language

Posted on: 2020-08-01  Degree: Master  Type: Thesis
Country: China  Candidate: F L Chen  Full Text: PDF
GTID: 2428330590471785  Subject: Control Science and Engineering
Abstract/Summary:
Speech is one of the main ways in which people communicate with each other and express emotions. Because the voice carries a great deal of information, speech research plays a vital role in human-computer interaction. Speech recognition, which converts speech into text, has received widespread attention from scholars and has already been commercialized in products such as "iFlytek Input". Speech emotion recognition has also been widely studied, but it is not yet widely used. With the rapid development of virtual reality and augmented reality technology, traditional human-computer interaction can no longer meet the public's growing demand for more natural and convenient interaction. Combining speech emotion recognition with virtual reality both demonstrates the practical value of speech emotion recognition and makes interaction in virtual environments more convenient and natural. Based on the differences between Chinese speech signals and those of other languages, this thesis studies methods for Chinese speech emotion recognition, tests the relevant recognition algorithms, and proposes an improved model for training the recognizer. The recognized emotional state is visualized in a virtual environment through designed body movements, and a natural interaction system for virtual environments based on speech emotion recognition is designed and implemented. The main research contents of this thesis are as follows:

1. Based on a study of traditional machine learning algorithms for Chinese speech signals, a feature set for Chinese speech emotion recognition is proposed. The emotional characteristics of Chinese speech are explored according to the properties of Chinese speech signals. The validity and feasibility of the features are confirmed using the corpus in a Chinese speech emotion database and the reference literature. The utterances in the speech database are described with MFCC features, zero-crossing rate (ZCR) features, and short-term energy features, and the traditional machine learning algorithm SVM is used for recognition and classification.

2. Recognition and classification models are studied according to the characteristics of Chinese speech emotion features. First, a one-dimensional convolutional neural network (CNN) is used to learn from the emotional features, and its output is fed into a Softmax classifier that identifies the emotional state contained in the speech signal and outputs the recognition result. Second, a long short-term memory (LSTM) neural network is used for feature learning on the extracted emotional features, and the emotional state contained in the speech signal is again recognized and output. Finally, after comparing and analyzing these experimental results, the thesis proposes a ConvLSTM learning network that integrates global and local features; the learned speech emotion features are fed into a Softmax classifier, which recognizes the emotional state and outputs the result. Comparative analysis of the experimental results shows that the ConvLSTM model outperforms the other models.

3. The validity and feasibility of the proposed method are verified. First, a virtual environment interaction system based on speech emotion recognition is designed and built; it visualizes speech emotion in the virtual environment through body movement. The system is then tested by carrying out emotional interaction between an autonomous virtual human and a virtual avatar.
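As a rough illustration of two of the frame-level features named in item 1, the sketch below computes short-term energy and zero-crossing rate with NumPy on a synthetic tone. It is not the thesis's implementation: the function names, the 25 ms frame length, the 10 ms hop, and the 16 kHz sampling rate are all assumptions chosen for the example, and MFCC extraction and the SVM stage are omitted.

```python
import numpy as np

def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]

def short_term_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames.astype(np.float64) ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Synthetic stand-in for a speech recording: 1 s of a 200 Hz tone at 16 kHz.
sr = 16000                      # assumed sampling rate
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 200 * t)

# Assumed framing: 25 ms frames (400 samples) with a 10 ms hop (160 samples).
frames = frame_signal(signal, frame_len=400, hop_len=160)
energy = short_term_energy(frames)
zcr = zero_crossing_rate(frames)

# One row per frame, one column per feature; MFCCs could be appended as
# further columns before the matrix is passed to a classifier.
features = np.column_stack([energy, zcr])
print(features.shape)
```

Per-frame values like these are usually summarized per utterance (for example by mean and variance) before being given to a classifier such as an SVM; whether the thesis does so is not stated in the abstract.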
Keywords/Search Tags:CNN, LSTM, Speech Emotion, Virtual Environment Interaction