
Research On Speech Emotion Recognition Based On Feature Selection And Optimization

Posted on: 2020-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: H H Li
Full Text: PDF
GTID: 2428330602951059
Subject: Engineering
Abstract/Summary:
As artificial intelligence develops, machine intelligence still leaves much to be studied. Enabling machines to possess emotions and thoughts similar to those of human beings is an important challenge for the field of human-computer interaction. Speech emotion recognition is a significant branch of emotion recognition and has great potential value for both scientific research and commercial applications. Motivated by the broad demand for speech emotion recognition, this thesis studies speech emotion recognition algorithms based on feature selection and optimization. The specific contents are as follows.

Features correlated with emotion are extracted from the original speech. They include four prosodic features (speech rate, pitch frequency, short-term average energy, and short-term average zero-crossing rate), one sound quality feature (formant frequency), and one spectral feature (mel-frequency cepstral coefficients, MFCC). Statistical parameters of these features are then computed.

A speech emotional feature selection algorithm based on random forest is studied. Using the extracted features, the recognition accuracy of different emotions with different features is examined. The optimal features for four emotional speech classes are obtained with the random-forest-based feature selection algorithm. When the emotions are combined in pairs, the best features for each pair are obtained with the same algorithm.

A speech emotion recognition algorithm based on a convolutional neural network (CNN) is studied. First, the input feature is formed from the MFCC and the features selected by the random forest. Then the CNN model is constructed according to the size of the input feature. The CNN is trained and then applied to recognize the emotion of different emotional utterances. The experimental results show that the proposed algorithm outperforms algorithms that use MFCC alone. Specifically, the recognition rate over the four emotions increases by 3.68%. Moreover, when the four emotions are combined in pairs, the accuracy ranges from 78.80% to 96.87%. In conclusion, the proposed algorithm is superior to the conventional MFCC-based method.

A speech emotion recognition algorithm based on a long short-term memory (LSTM) network is studied. First, the input feature is formed from the MFCC and the features selected by the random forest. Then the LSTM model is constructed. The LSTM is trained and then applied to recognize the emotion of different emotional utterances. The experimental results show that the proposed algorithm outperforms algorithms that use MFCC alone. Specifically, the recognition rate over the four emotions increases by 1.14%. Moreover, when the four emotions are combined in pairs, the accuracy ranges from 60.38% to 87.14%. In conclusion, the proposed algorithm is superior to the conventional MFCC-based method.

The research results of this thesis can be applied in areas such as human-computer interaction, medical diagnosis, and criminal investigation.
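
Two of the prosodic features named in the abstract, short-term average energy and short-term average zero-crossing rate, together with their statistical parameters, can be illustrated with a minimal sketch. This is not the thesis's implementation: the 25 ms frame length, 10 ms hop, 8 kHz sampling rate, synthetic 100 Hz tone, and all function names are assumptions chosen only for illustration.

```python
import math
import statistics

def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames (no padding)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_term_energy(frame):
    """Average of the squared amplitudes within one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def summarize(values):
    """Utterance-level statistics of a per-frame feature track."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }

# Assumed test input: a 100 Hz sine tone, 0.5 s at 8 kHz (not real speech).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(4000)]

# Assumed framing: 25 ms frames (200 samples) with a 10 ms hop (80 samples).
frames = frame_signal(signal, frame_len=200, hop=80)

energy_stats = summarize([short_term_energy(f) for f in frames])
zcr_stats = summarize([zero_crossing_rate(f) for f in frames])
```

In a pipeline like the one the abstract describes, per-utterance statistics of this kind would be candidates for the feature selection stage; how the thesis actually frames the signal and which statistics it keeps are not stated in the abstract.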
Keywords/Search Tags: speech emotion, feature extraction, random forest, convolutional neural network, long short-term memory network