Research On Speech Emotion Recognition Based On Convolutional Recurrent Neural Network

Posted on: 2022-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: X T Hu
Full Text: PDF
GTID: 2518306605472074
Subject: Signal and Information Processing
Abstract/Summary:
In recent years, with the rapid development of artificial intelligence, people's demands on machines have grown. They expect machines to have emotions just as humans do. Speech, as an important means of communication between people, conveys not only semantic information but also rich information about emotional state. The study of emotion in speech is therefore of great significance and can help humans and machines interact more naturally and smoothly. This thesis studies speech emotion recognition based on the convolutional recurrent neural network; the work covers the following three aspects.

First, emotion-related features are extracted from the speech signals of the emotional speech database, including prosodic features, voice quality features, and spectral features. Statistical functions are then applied to these features to obtain their statistical features. Finally, all the features are combined along the feature dimension to form a two-dimensional feature matrix, which is used for model training.

Second, taking the extracted features as the model input, speech emotion recognition based on a recurrent neural network with an attention mechanism is studied. The effect of training-sample length on the recognition rate is examined first: the model achieves its highest recognition accuracy when the training samples are 180 frames long. With 180 frames as the training-sample length, speech emotion recognition is then carried out with a plain recurrent neural network and with a recurrent neural network with an attention mechanism. Experiments show that the attention-based recurrent neural network improves performance by 6.51% when recognizing the four emotions considered in this thesis. In the recognition of two kinds of emotion, the model's accuracy also improves significantly. The feature combination and attention-based model proposed in this thesis therefore have a significant effect on speech emotion recognition.

Third, an integrated network model that combines the structural characteristics of the convolutional neural network and the recurrent neural network is proposed and used for speech emotion recognition. This integrated structure, called the convolutional recurrent neural network, is designed first. The network is then trained with the features extracted in this thesis as input. The influence of training-sample length on recognition accuracy is also examined: accuracy is highest when the training samples are 140 frames long. Finally, with 140 frames as the training-sample length, the convolutional recurrent neural network is used to recognize speech emotion. The experimental results show that the convolutional recurrent neural network with 64-dimensional features as input reaches a recognition accuracy of 85.71%, whereas the convolutional neural network with 46-dimensional features as input reaches 77.00%, an increase of 8.71 percentage points. The recognition rate for two kinds of emotion is also significantly higher than with the convolutional neural network alone. In conclusion, the feature combination and convolutional recurrent neural network model proposed in this thesis have a significant effect on speech emotion recognition.
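
Two ideas the abstract relies on, fixed-length training samples cut from a two-dimensional feature matrix and attention over per-frame outputs, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how utterances are segmented or padded, nor which attention scoring is used. Zero padding of the last sample, dot-product scoring, and the function names `split_into_samples` and `attention_pool` are assumptions made here for illustration, and the `query` vector stands in for a parameter that a real model would learn.

```python
import math
from typing import List

Frame = List[float]  # one frame = one feature vector (e.g. 64 values)


def split_into_samples(frames: List[Frame], sample_len: int) -> List[List[Frame]]:
    """Cut a (num_frames x feat_dim) matrix into fixed-length samples.

    Assumption: the final, shorter chunk is padded with all-zero frames.
    """
    if sample_len <= 0:
        raise ValueError("sample_len must be positive")
    if not frames:
        return []
    feat_dim = len(frames[0])
    samples = []
    for start in range(0, len(frames), sample_len):
        chunk = [list(f) for f in frames[start:start + sample_len]]
        # Pad so that every sample has exactly sample_len frames.
        chunk += [[0.0] * feat_dim for _ in range(sample_len - len(chunk))]
        samples.append(chunk)
    return samples


def attention_pool(hidden: List[Frame], query: Frame) -> Frame:
    """Softmax-weighted average of per-frame vectors.

    Assumption: scores are dot products with a (hypothetical) query vector.
    """
    if not hidden:
        raise ValueError("hidden must contain at least one frame")
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden[0])
    # Weighted sum over frames, computed independently for each dimension.
    return [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]


if __name__ == "__main__":
    # A toy utterance of 300 frames with 2 features per frame.
    utterance = [[float(i), 1.0] for i in range(300)]
    samples = split_into_samples(utterance, sample_len=140)
    print(len(samples), len(samples[0]), len(samples[0][0]))
    print(attention_pool([[1.0, 0.0], [0.0, 1.0]], query=[0.0, 0.0]))
```

In a recurrent model, `hidden` would be the per-frame outputs of the recurrent layer, and the pooled vector would feed the emotion classifier. Changing `sample_len` (for example between 140 and 180 frames) changes how many samples each utterance yields and how much padding the last one contains.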
Keywords/Search Tags: speech emotion recognition, recurrent neural network, attention mechanism, long short-term memory network, convolutional recurrent neural network