
Speech Emotion Recognition With Nonlinear Entropy Fusion

Posted on: 2022-07-18    Degree: Master    Type: Thesis
Country: China    Candidate: H Li    Full Text: PDF
GTID: 2518306722979469    Subject: Education Technology
Abstract/Summary:
As a nonintrusive signal, the speech used in daily communication transmits abundant interactive information and plays an important role in interpersonal activities. Speech conveys not only textual content but also the emotional state of the speaker, so the same text spoken in different emotional states can carry very different meanings. With the rapid development of artificial intelligence, the field of affective computing is making continuous technical breakthroughs, and research on speech emotion recognition is becoming increasingly deep and widely applied across many fields.

Speech emotion recognition classifies the emotion of speech by extracting feature parameters related to emotion representation from the speech signal and establishing, through model training, a mapping between feature parameters and emotion categories. This thesis approaches the problem from the perspective of feature extraction: it introduces the acoustic features commonly used in speech emotion recognition and, to address the incompleteness of emotion feature parameter information, proposes nonlinear entropy features as supplementary emotion descriptors. Recognition accuracy is then further improved from the perspective of feature optimization. The main research work of this thesis is as follows:

1) Two widely recognized acoustic feature sets, eGeMAPS and IS10, were used to extract feature parameters from three databases: EMO-DB, CASIA and IEMOCAP. A support vector machine and a convolutional neural network were trained for emotion classification, and the two feature sets were compared in terms of running time, number of features and recognition accuracy. Based on the aims of the experiment, eGeMAPS was selected as the benchmark acoustic feature set for this study.

2) From the perspective of nonlinear entropy, the sample entropy, approximate entropy and information entropy of the speech signal are calculated. Through two-emotion and four-emotion classification experiments, the ability of entropy features to discriminate emotion is explored and the best combination of entropy features is identified. The experimental results show that the entropy feature set composed of sample entropy, approximate entropy and information entropy has the best classification ability, and that fusing this entropy feature set with the eGeMAPS feature set at the feature layer steadily improves the accuracy of speech emotion recognition.

3) In the 91-dimensional mixed feature set obtained by fusing the entropy features with eGeMAPS, redundant information may remain between features. A feature selection algorithm can reduce the feature dimension while maintaining accuracy, which demonstrates the practical value of the feature selection step. During feature selection, the weights of different features in the emotion recognition task are ranked, and the features with higher weights in databases of different languages are analyzed, providing a reference for future research on cross-corpus recognition and related directions.
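The sample entropy mentioned above measures how often short templates in a signal repeat within a tolerance. The sketch below is an illustrative implementation, not the thesis's actual code; the embedding dimension m=2 and tolerance r = 0.2 x standard deviation are common defaults, assumed here rather than taken from the thesis.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """Sample entropy of a 1-D signal.

    Counts template matches of length m and m+1 (Chebyshev distance,
    self-matches excluded) within tolerance r, and returns -ln(A/B),
    where A and B are the match counts for lengths m+1 and m.
    """
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()  # common default tolerance
    n = len(x)

    def count_matches(length):
        # All overlapping templates of the given length.
        templates = np.array([x[i:i + length] for i in range(n - length)])
        count = 0
        for i in range(len(templates)):
            # Chebyshev distance to every later template (no self-matches).
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(d <= r))
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")
```

A regular signal (e.g. a sine wave) repeats its templates often and yields a low sample entropy, while white noise yields a high one, which is what makes the measure useful as a supplementary emotion descriptor.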
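Feature-layer fusion, as used in point 2), amounts to concatenating the per-utterance acoustic and entropy feature vectors before classification. The sketch below assumes the 88-dimensional eGeMAPS functionals plus 3 entropy features (giving the 91 dimensions mentioned in point 3)) and adds z-score normalization, a common but assumed preprocessing step not stated in the abstract.

```python
import numpy as np

def fuse_feature_layer(acoustic, entropy):
    """Feature-layer fusion: concatenate per-utterance feature vectors.

    acoustic: (n_samples, 88) eGeMAPS functionals
    entropy:  (n_samples, 3)  sample/approximate/information entropy
    returns:  (n_samples, 91) fused matrix, z-normalized per column
    """
    fused = np.hstack([acoustic, entropy])
    # Z-score each column so no single feature scale dominates the classifier.
    mu = fused.mean(axis=0)
    sigma = fused.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns
    return (fused - mu) / sigma
```

The fused matrix can then be fed to the SVM or CNN classifiers described in point 1).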
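The feature-weight ranking in point 3) can be illustrated with a simple filter-style criterion. The abstract does not name the selection algorithm used, so the ANOVA-style F ratio below (between-class over within-class variance) is only one common choice, shown as a hypothetical example.

```python
import numpy as np

def rank_features(X, y):
    """Rank features by a one-way ANOVA-style F ratio.

    X: (n_samples, n_features) feature matrix
    y: (n_samples,) integer class labels
    Returns feature indices sorted from most to least discriminative.
    """
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    within[within == 0] = 1e-12  # avoid division by zero
    score = between / within     # higher = more class-discriminative
    return np.argsort(score)[::-1]
```

Keeping only the top-ranked features reduces the 91 dimensions while preserving the discriminative information, which is the practical benefit of the feature selection step described above.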
Keywords/Search Tags:speech emotion recognition, feature extraction, nonlinearity, machine learning, feature selection