Research On Speech Emotion Recognition Technology Based On Nonlinear Feature And Spectral Feature Extraction

Posted on: 2020-09-25    Degree: Master    Type: Thesis
Country: China    Candidate: W J Yin    Full Text: PDF
GTID: 2518306353456954    Subject: Systems Engineering
Abstract/Summary:
Speech is one of the most important means of communication in daily life and carries rich emotional information. To make human-machine interaction more natural and intelligent, researchers have turned their attention to speech emotion recognition. Although the field has made great progress over the past few decades, existing speech emotion recognition systems still lack emotional features that accurately capture the emotional information in speech. Feature extraction therefore remains a focal point of current research: extracting representative emotional features and improving recognition accuracy is of high practical significance for emotional human-computer interaction. The main work of this thesis is as follows:

(1) A multi-feature fusion method based on decision-level power-exponent weighting is proposed. At present, most researchers fuse emotional features by feature cascading, which ignores the differences between features. A few scholars use decision-level fusion based on simple addition or multiplication rules, but when the decision-level probabilities are similar, such fusion is not very effective. A method based on power-exponent weighted fusion is therefore proposed. A power factor function is introduced to compute the weight coefficients, redistributing the weights so that the better-performing classifier receives a larger weight, which improves the final recognition result. Simulations on the CASIA speech database show that the final recognition rate improves over the commonly used feature cascading method.

(2) Traditional emotional feature extraction assumes that speech is a short-term stationary signal, whereas real speech signals are nonlinear and non-stationary. To address this, the thesis uses the ensemble empirical mode decomposition (EEMD) algorithm to process the nonlinear, non-stationary signal and extracts an IMF energy entropy (IMFE) feature based on it. The emotional speech signal is decomposed by EEMD into a set of intrinsic mode functions (IMFs), the Spearman rank correlation coefficient is used to screen for the effective IMF components, and a new speech emotion feature, the IMF energy entropy (IMFE), is obtained by computing the energy entropy of those components. Simulations on the CASIA speech database, compared against the recognition rates of prosodic features and MFCC, show that IMFE identifies emotion effectively and performs best on negative emotions.

(3) Because spectrum-based emotion recognition makes relatively little combined use of time and frequency information, a feature extraction method based on the Gabor gray image spectrum and an improved complete local binary pattern (GGCLBP) is proposed. The spectral grayscale image of the emotional speech signal is extracted, and the Gabor transform is used to amplify its local texture information, yielding the Gabor grayscale image spectrum. The texture features of the Gabor gray image spectrum are then extracted with the improved complete local binary pattern to form the GGCLBP feature. Simulations on the CASIA speech database, compared against the recognition rates of traditional acoustic features, show that GGCLBP achieves a higher recognition rate and fuses better than traditional acoustic features.
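
The decision-level weighting described in item (1) can be illustrated with a minimal sketch. The abstract does not give the power factor function, so the form used here (validation accuracy raised to an exponent `gamma`, then normalized) is an assumption for illustration only, as are the function names and the toy numbers.

```python
def power_weights(accuracies, gamma=2.0):
    """Turn per-classifier accuracies into normalized weights.

    ASSUMPTION: the weight is accuracy ** gamma, normalized to sum to 1.
    The thesis's actual power factor function is not stated in the abstract.
    With gamma > 1 the gap between classifiers widens, so the stronger
    classifier receives a larger share of the weight.
    """
    powered = [a ** gamma for a in accuracies]
    total = sum(powered)
    return [p / total for p in powered]


def fuse(posteriors, weights):
    """Weighted sum of class-probability vectors, one vector per classifier."""
    n_classes = len(posteriors[0])
    return [
        sum(w * p[c] for w, p in zip(weights, posteriors))
        for c in range(n_classes)
    ]


# Hypothetical numbers: two classifiers, three emotion classes.
accuracies = [0.80, 0.60]
posteriors = [
    [0.40, 0.35, 0.25],  # classifier trained on one feature set
    [0.30, 0.45, 0.25],  # classifier trained on another feature set
]

weights = power_weights(accuracies, gamma=2.0)
fused = fuse(posteriors, weights)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Setting `gamma` to 0 reduces this to a plain average of the classifier outputs, which is one way to compare against the simple addition rule mentioned above.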
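
The energy entropy step in item (2) can be sketched as follows. This assumes the usual definition (Shannon entropy of the normalized per-IMF energies, natural logarithm); the abstract does not spell out the formula, and the EEMD decomposition and Spearman screening are taken as already done. The function name and the toy component lists are hypothetical.

```python
import math


def imf_energy_entropy(imfs):
    """Shannon entropy of how signal energy is spread across IMFs.

    ASSUMPTION: energy of an IMF is the sum of its squared samples, and the
    entropy uses the natural logarithm. `imfs` is a list of sample lists,
    standing in for the IMFs kept after screening.
    """
    energies = [sum(x * x for x in imf) for imf in imfs]
    total = sum(energies)
    if total == 0:
        return 0.0
    entropy = 0.0
    for e in energies:
        if e > 0:
            p = e / total  # share of total energy held by this IMF
            entropy -= p * math.log(p)
    return entropy


# Toy stand-ins for screened IMFs (not real EEMD output).
equal = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
skewed = [[3.0, -3.0, 3.0, -3.0], [0.1, 0.1, -0.1, -0.1]]

h_equal = imf_energy_entropy(equal)    # energy split evenly: entropy is ln(2)
h_skewed = imf_energy_entropy(skewed)  # energy concentrated: entropy near 0
```

The entropy is largest when energy is spread evenly over the components and falls toward zero as one component dominates, which is what makes it usable as a compact descriptor of the decomposition.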
Keywords/Search Tags: Emotion recognition, power exponential weighting, IMFE, Gabor grayscale image spectrum, improved complete local binary pattern