
Research On Speech Emotion Recognition Based On Deep Convolution Neural Network

Posted on: 2022-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: B C Dong
GTID: 2518306476990549
Subject: Communication and Information System
Abstract/Summary:
In recent years, the rapid development of artificial intelligence has driven continuous progress in science and technology, and research on images, speech, and text within the field has grown vigorously, as has work on intelligent question answering. Speech emotion recognition, an indispensable part of speech research, has gradually become a research focus, building on speech semantic recognition and related technologies. Improving the accuracy of speech emotion recognition with existing technology has become a primary goal of speech-related artificial intelligence research.

First, this thesis examines the structure of MFCC (Mel-Frequency Cepstral Coefficients) features, which perform well in current speech emotion recognition research. Using the CASIA Chinese emotion data set, and taking the trend of neural networks toward wider structures as the core idea, it proposes a Fractal-Bi Net model that combines the static MFCC features with the full MFCC feature set, which enhances the expressive power of the static features to a certain extent. Experiments with this model confirm that, on the data set used in this thesis, the static MFCC features make a greater contribution to recognition accuracy, and that Fractal-Bi Net improves recognition accuracy over the benchmark model using the same infrastructure, with increases of 3.334% and 1.667%.

Second, the other development trend of neural networks is toward deeper structures. Drawing on previous emotion research and comparing the structure and depth of ResNet (deep residual network) and DenseNet (dense convolutional network), the thesis argues that MFCC speech emotion features, like other features, carry a certain emotional emphasis in different convolution channels. On this basis it proposes Cef-Dense Net, a network based on DenseNet, and verifies it experimentally. The experiments show that the network improves the recognition accuracy of MFCC features on the selected data set by 5% compared with the benchmark model.

In summary, this thesis uses convolutional neural networks as the model infrastructure and improves the related networks along the two development directions of wider and deeper structures. Compared with the benchmark model, the speech emotion recognition rate is improved to some extent, which contributes positively to research on speech emotion recognition.
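The distinction between the static MFCC features and the full feature set can be illustrated with a small sketch. It assumes, since the abstract does not spell this out, that "all the features" means the static coefficients plus their first- and second-order differences (deltas), computed with the commonly used regression formula. The function names `delta` and `stack_static_and_dynamic`, the window width of 2, and the toy input values are illustrative choices, not taken from the thesis.

```python
def delta(frames, width=2):
    """First-order delta of a list of per-frame coefficient vectors.

    Uses the common regression formula
        d[t] = sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n} n^2)
    with the first and last frames repeated as padding at the edges.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_static_and_dynamic(static):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(static)   # first-order differences
    d2 = delta(d1)       # second-order differences
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Toy input: 4 frames x 2 coefficients (made-up numbers, not real MFCCs).
static = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
full = stack_static_and_dynamic(static)
```

Under these assumptions, `static` would play the role of the static-feature input and `full` the role of the complete feature input; how the thesis actually feeds the two into Fractal-Bi Net is not described in the abstract.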
Keywords/Search Tags: Speech emotion recognition, Mel frequency cepstral coefficient, Deep learning, Convolution neural network