Research On Speech Emotion Analysis Algorithm Based On PAD Emotion 3D Model

Posted on: 2021-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: B W Cui
Full Text: PDF
GTID: 2518306041461754
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of artificial intelligence in recent years, affective computing, regarded as an advanced stage of artificial intelligence, has gradually become an important branch of the field. Accurately identifying and analyzing human emotions is the key to making human-computer interaction more intimate, more accurate, and genuinely natural and harmonious. Because human emotions are complex, subtle, and continuous, many researchers in China and abroad have turned to dimensional emotion research. In human interaction, voice is the most direct communication channel: people can clearly perceive changes in others' emotions through their voices, for example through changes in particular tone words and intonation.

This thesis approaches speech emotion by building a dimensional speech database based on the PAD three-dimensional emotion model. Building on traditional emotional speech recognition, it adopts a cascade classification method that combines acoustic features with the PAD data of emotional speech. The SVR (Support Vector Regression) algorithm is used to construct a PAD prediction model for dimensional emotional speech, and the model parameters are optimized by an algorithm that combines the grid search algorithm with the particle swarm optimization algorithm. The experiments obtained good results. The main work is as follows:

(1) A correlation analysis method for emotional speech features and the PAD emotion model is proposed. Rather than being limited to the four basic emotions of anger, happiness, surprise, and sadness, the work applies continuous dimensional emotion theory to the analysis of emotional content in the speech database and uses the PAD (pleasure, arousal or activation, and dominance) three-dimensional emotion model to describe emotional speech. Four types of speech features are extracted from the EMO-DB emotional speech database for emotional speech recognition: Mel-frequency cepstral coefficients, linear prediction coefficients, prosodic features, and formant frequency features. The PCA (Principal Component Analysis) algorithm is used to reduce the dimensionality of the emotional features before PAD prediction, which reduces the correlation between features. Experiments show that this method improves the accuracy of PAD prediction to a certain extent.

(2) By analyzing the shortcomings of the standard regression model in PAD dimension prediction, an SVR regression model tuned by a combination of the PSO (Particle Swarm Optimization) algorithm and the grid search algorithm is proposed. Combining the strong global search ability of PSO with the high local search accuracy of the grid algorithm optimizes the SVR parameters and reduces the blindness of parameter selection. A side-by-side comparison of PAD prediction results shows that this model predicts better than the original model.

(3) Combining the PCA dimensionality reduction method with the SVR model, a PCA-PSO-SVR regression model is proposed and used to predict the PAD dimensions. According to the experimental results, it predicts the PAD dimensions better than PCA-SVR and PSO-SVR. To further verify the accuracy of the regression models' PAD predictions, three groups of emotion recognition experiments were run and compared. The first group recognizes emotions with the directly extracted MNFF features (a fusion of MFCC, nonlinear, prosodic, and formant features). The second group uses the PAD values predicted by the three models (PCA-SVR, PSO-SVR, and PCA-PSO-SVR) as features. The third group combines the PAD values predicted by each of the three models with the MNFF features. The three groups of results show that emotion recognition is best when the PAD values predicted by the PCA-PSO-SVR model are fused with the MNFF features.
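
The two-stage parameter search described in point (2) can be illustrated with a minimal sketch. This is not the thesis's implementation: the objective `toy_cv_error` is a hypothetical stand-in for the cross-validated error of an SVR model over (log C, log gamma), and the search bounds, swarm size, inertia and acceleration constants, and grid width are all assumed values chosen for illustration. The sketch only shows the idea of a coarse global PSO search followed by a fine local grid search around the PSO result.

```python
import random


def toy_cv_error(log_c, log_gamma):
    """Hypothetical stand-in for an SVR cross-validation error surface.

    Assumed for illustration only; its minimum is at log_c=3, log_gamma=-2.
    """
    return (log_c - 3.0) ** 2 + 2.0 * (log_gamma + 2.0) ** 2


def pso_search(objective, bounds, n_particles=20, n_iters=60, seed=0):
    """Coarse global search with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # per-particle best positions
    best_val = [objective(*p) for p in pos]      # per-particle best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best
    w, c1, c2 = 0.7, 1.5, 1.5                    # assumed inertia / pull weights
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(*pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def grid_refine(objective, center, half_width=0.5, steps=11):
    """Fine local search on a small 2-D grid centred on the PSO result."""
    best_pos, best_val = list(center), objective(*center)
    offsets = [-half_width + 2 * half_width * k / (steps - 1) for k in range(steps)]
    for dx in offsets:
        for dy in offsets:
            cand = [center[0] + dx, center[1] + dy]
            val = objective(*cand)
            if val < best_val:
                best_pos, best_val = cand, val
    return best_pos, best_val


# Assumed search ranges for (log C, log gamma).
bounds = [(-5.0, 10.0), (-10.0, 5.0)]
coarse_pos, coarse_val = pso_search(toy_cv_error, bounds)
fine_pos, fine_val = grid_refine(toy_cv_error, coarse_pos)
print("PSO result: ", coarse_pos, coarse_val)
print("Grid result:", fine_pos, fine_val)
```

In practice the toy objective would be replaced by a function that trains an SVR on the PCA-reduced features and returns a validation error for each PAD dimension. The grid stage starts from the PSO result, so it never returns a worse value than the PSO stage found.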
Keywords/Search Tags: sentiment dimension, least squares support vector machine, least squares support vector regression machine, principal component analysis, particle swarm optimization algorithm, grid optimization algorithm