Research on Emotional Speech Based on the PAD Three-Dimensional Emotion Model

Posted on: 2019-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: T Zhang
GTID: 2348330569479536
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, with the rapid development of artificial intelligence, affective computing has become an important branch of the field. Natural and harmonious human-machine interaction depends on machines that can accurately understand human emotion. Because human emotion is complex, subtle, and continuous, dimensional emotion research has attracted wide attention from affective computing researchers in China and abroad. Speech is the most direct and important means of communication, and it carries emotional information along with its textual content. This thesis studies emotional speech, starting from the construction of a dimensional speech database based on the PAD three-dimensional emotion model. Building on traditional emotional speech recognition, it proposes a cascade classification method that combines acoustic features with the PAD data of emotional speech. Support vector regression (SVR) is applied to build a PAD prediction model for dimensional emotional speech, and the experiments achieved good results. The main innovations and work of this thesis are as follows:

(1) The emotional speech database was screened, and the emotional intensity of each sentence was annotated at the same time. By combining the PAD three-dimensional emotion model with the SAM model, an improved, simplified version of the PAD emotion scale was designed and used to carry out PAD labeling experiments on the emotional speech database.

(2) To verify the validity of the PAD data obtained from the labeling experiments, the mean and standard deviation of the PAD data were analyzed statistically. The distribution of the PAD data in the three-dimensional emotion space was also analyzed, and the center coordinates of four emotion categories in that space were identified. The results indicate that the dimensional speech database constructed in this thesis is reasonable and effective.

(3) Building on the traditional support vector machine (SVM) approach to emotional speech recognition, this thesis proposes a cascade classification method that combines acoustic features with the PAD data of emotional speech. First, the prosodic features and MFCC features of emotional speech were extracted, and contrast experiments were designed to find the best combination of acoustic features. Analysis of the PAD values showed that the pleasure scores differed markedly, which makes it possible to distinguish easily confused emotions effectively. Therefore, by combining the best acoustic feature combination with the pleasure level, the cascade classification method greatly improves the recognition rate.

(4) A PAD prediction model for dimensional emotional speech based on support vector regression (SVR) was proposed. The SVR prediction model was trained and evaluated according to two criteria: mean squared error and squared correlation coefficient. The best-performing radial basis function (RBF) was determined through comparative experiments, and the PAD data were predicted with it. The experimental results showed that the SVR-based model predicts PAD data well, and that prediction accuracy for activation (A) was better than for pleasure (P) and dominance (D).
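The two evaluation criteria named in item (4), mean squared error and squared correlation coefficient, can be illustrated with a minimal sketch. The abstract does not give its formulas, so this assumes the common definitions: the average squared residual, and the square of the Pearson correlation between annotated and predicted values. The function names and the toy P/A/D numbers below are hypothetical and are not taken from the thesis.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between annotated and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def squared_correlation(y_true, y_pred):
    """Square of the Pearson correlation between annotations and predictions."""
    n = len(y_true)
    if n != len(y_pred) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    sum_t, sum_p = sum(y_true), sum(y_pred)
    sum_tp = sum(t * p for t, p in zip(y_true, y_pred))
    sum_tt = sum(t * t for t in y_true)
    sum_pp = sum(p * p for p in y_pred)
    numerator = (n * sum_tp - sum_t * sum_p) ** 2
    denominator = (n * sum_tt - sum_t ** 2) * (n * sum_pp - sum_p ** 2)
    if denominator == 0:
        raise ValueError("correlation is undefined for constant input")
    return numerator / denominator


# Toy values, invented purely for illustration (not results from the thesis).
annotated = {
    "P": [-0.6, 0.4, 0.8, -0.2],
    "A": [0.7, 0.1, 0.5, -0.4],
    "D": [0.2, -0.3, 0.6, 0.0],
}
predicted = {
    "P": [-0.4, 0.5, 0.6, 0.0],
    "A": [0.6, 0.2, 0.5, -0.3],
    "D": [0.0, -0.1, 0.3, 0.2],
}

# One (MSE, squared correlation) pair per PAD dimension.
scores = {
    dim: (
        mean_squared_error(annotated[dim], predicted[dim]),
        squared_correlation(annotated[dim], predicted[dim]),
    )
    for dim in ("P", "A", "D")
}

for dim, (mse, r2) in scores.items():
    print(f"{dim}: MSE={mse:.4f}, squared correlation={r2:.4f}")
```

A lower MSE and a squared correlation closer to 1 both indicate a better fit, so computing the pair separately for P, A, and D is one way to compare prediction quality across the three dimensions.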
Keywords/Search Tags: emotional speech recognition, PAD three-dimensional emotion model, dimensional emotional speech database, cascade classification, support vector regression (SVR), regression prediction model