Research Of Feature Extraction And Recognition For Speech Emotion

Posted on: 2014-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Xiang
GTID: 2248330398994636
Subject: Control theory and control engineering
Abstract/Summary:
Emotional information in speech plays an important role in everyday human communication and in production. As an important indicator of the intelligence of a human-computer interface, the analysis and recognition of emotional speech is one of the key issues in realizing artificial intelligence, and it has attracted increasing attention from researchers. Moreover, speech emotion recognition has been applied in many fields, such as distance education, criminal investigation, medical science, entertainment and the service industry. At present, research on speech emotion is restricted by the development of emotion theory, the complexity of language and the state of related disciplines, and it therefore has many limitations. Research on speech emotion recognition thus has important theoretical significance and practical value.

Based on a text-independent emotional speech database, this thesis studies the extraction and recognition of emotional features. The main contents of the thesis are as follows:

(1) Several influential emotional speech databases are introduced. The thesis studies how an emotional speech database is constructed and establishes a database of 800 sentences covering four emotions: happiness, anger, fear and neutral.

(2) Signal analysis based on the Hilbert-Huang transform is studied. Empirical mode decomposition (EMD) separates the emotional speech signal into intrinsic mode functions (IMFs). The Hilbert spectrum, obtained by applying the Hilbert transform to the IMFs, gives a clearer view of the time-frequency distribution of the signal. Ensemble empirical mode decomposition (EEMD) is an improved version of EMD; comparison and analysis show that it has anti-aliasing properties.

(3) The thesis describes the properties and extraction methods of speech emotion features such as pitch, formant frequency, linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). EEMD and the Hilbert marginal spectrum are introduced for processing the nonlinear, non-stationary speech signal. Based on the masking effect and the Hilbert marginal spectrum, the thesis proposes an emotional marginal spectrum, which has a more concentrated critical-frequency distribution and characterizes emotional information effectively.

(4) The thesis reviews common classification methods for speech emotion recognition and presents a method based on multi-strategy and LibSVM. Following the discrete emotion model, this method uses the emotion features hierarchically. The results show that the method based on a multi-stage scheme and a support vector machine can improve emotion recognition accuracy.
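The Hilbert marginal spectrum mentioned in items (2) and (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the EMD/EEMD sifting step is not reproduced, so two synthetic sinusoids stand in for IMFs, and the masking-effect weighting behind the emotional marginal spectrum is left out. The function names, the 1000 Hz sampling rate and the 50-bin frequency grid are assumptions chosen for illustration. Only the standard library is used, so the analytic signal is computed with a direct O(N^2) DFT; on real speech an FFT-based routine would be the practical choice.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence, computed with a direct DFT."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue              # keep the Nyquist bin unchanged
        if k < (n + 1) // 2:
            spectrum[k] *= 2      # double the positive frequencies
        else:
            spectrum[k] = 0       # remove the negative frequencies
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_amplitude_frequency(component, fs):
    """Instantaneous amplitude and frequency (Hz) of one IMF-like component."""
    z = analytic_signal(component)
    amplitude = [abs(v) for v in z[:-1]]
    # The phase step between neighbouring samples gives the frequency
    # without an explicit phase-unwrapping pass.
    frequency = [
        cmath.phase(z[i + 1] * z[i].conjugate()) * fs / (2 * math.pi)
        for i in range(len(z) - 1)
    ]
    return amplitude, frequency


def marginal_spectrum(components, fs, n_bins):
    """Accumulate instantaneous amplitude over time into frequency bins."""
    bin_width = (fs / 2) / n_bins
    h = [0.0] * n_bins
    for component in components:
        amplitude, frequency = instantaneous_amplitude_frequency(component, fs)
        for a, f in zip(amplitude, frequency):
            if 0 <= f < fs / 2:
                h[min(int(f / bin_width), n_bins - 1)] += a
    return h


# Hypothetical stand-ins for two IMFs: a 55 Hz tone and a weaker 125 Hz tone.
fs = 1000.0
samples = 200
imf_like = [
    [math.cos(2 * math.pi * 55 * t / fs) for t in range(samples)],
    [0.5 * math.sin(2 * math.pi * 125 * t / fs) for t in range(samples)],
]
h = marginal_spectrum(imf_like, fs, n_bins=50)
peak_bin = max(range(len(h)), key=lambda i: h[i])
print("strongest bin starts at", peak_bin * (fs / 2) / len(h), "Hz")
```

With these inputs the accumulated amplitude falls into the 50 to 60 Hz bin and the 120 to 130 Hz bin, the first holding about twice the energy of the second. For real speech, the list passed to `marginal_spectrum` would hold the IMFs produced by an EMD or EEMD routine.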
Keywords/Search Tags: Speech emotion recognition, Feature extraction, Ensemble empirical mode decomposition, Support vector machine, Multi-stage