Continue Emotional Computing Based On Envelop Spectral Modulation Pattern For Chinese Speech

Posted on: 2013-02-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Q Qin
Full Text: PDF
GTID: 1118330371990760
Subject: Circuits and Systems
Abstract/Summary:
Continuous emotion computing from speech is an active research area with a wide variety of applications in intelligent human-machine interaction systems. Although many researchers have investigated discrete emotion recognition from speech and achieved some results, no satisfactory solution for continuous speech emotion exists yet. The goal of this thesis is continuous emotion computing for Chinese speech, that is, determining the continuous emotional states of a particular speaker from uttered speech samples.

This thesis presents the distributional relation between a computable, auditory-psychology-based model of continuous emotion, the envelope spectral modulation pattern (ESMP), and the psychological emotion dimensions (valence, arousal, dominance and power), for the automatic recognition of continuous human affective information from speech. The ESMP is extracted from an auditory-inspired long-term spectro-temporal representation and includes both spectral frequency and temporal modulation frequency components, so it conveys emotional information through perceptual spectral features of human speech rather than conventional acoustic features.

The primary content of the thesis is the following: ① building a fuzzy continuous emotional speech database; ② the perceptual listening test: dimensional analysis of speech emotion; ③ the computer test: envelope spectral feature extraction, spectral computation and fuzzy emotional classification.

The Mandarin fuzzy continuous emotional speech collection: based on an analysis of several international emotional speech databases, we decided on the subjects, the speakers, the kind of speech (natural, simulated, elicited), the kinds of fuzzy emotions and the number of utterances.
The emotional states studied are limited to 5 fuzzy basic emotional behaviors, (some, half and very) joy, anger, surprise, sadness and fear, and 1 fuzzy secondary derived emotion, (some, half and very) surprise-joy, supplemented by reference speech that indicates a no-emotion state. After the first perceptual listening test on the collected fuzzy emotional speech data, the Mandarin fuzzy emotional speech database was built for further research.

The second and third perceptual listening tests: the distributions of these emotions in valence-arousal-dominance (V-A-D) space were studied. Each of the 3 dimensions was represented by 7 gradations. Normal-hearing listeners were asked to listen to the emotional utterances selected in the first listening test and to rate each utterance on the 7 gradations for each of the 3 V-A-D dimensions. This yields the distribution of each emotion in V-A-D space.

The computer test: first, the envelope features (upper and lower envelope, envelope spectrum and envelope feature vectors) of emotional speech are spectrally analyzed relative to the reference (no-emotion) speech. These features are then extracted using the ensemble empirical mode decomposition (EEMD) piecewise power function (PPF) algorithm. Emotional intrinsic mode functions (IMFe) are obtained by applying EEMD to the emotional speech signals, and the Mel-frequency cepstral coefficients of each IMFe are extracted as emotional feature coefficients, which are used for speaker emotion identification by vector quantization. The envelope and envelope spectrum are obtained by transforming the IMFe, and envelope feature vectors can also be obtained with the fast Fourier transform (FFT) algorithm.

Based on the envelope features, the thesis further studies the power spectral density and power spectrum of Chinese emotional speech, from which the ESMP is obtained. MATLAB is used to compute the EEMD and envelope spectral characteristics of every fuzzy emotion and to obtain the ESMP of these fuzzy emotions.
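The envelope and envelope spectrum step can be illustrated with a minimal sketch. The abstract does not name the transform applied to each IMFe, so this sketch assumes the common choice of a Hilbert-transform analytic signal. It also uses a naive DFT in place of an FFT so that it runs with the Python standard library only, and a synthetic amplitude-modulated tone in place of a real IMFe. The function names (`dft`, `idft`, `envelope`, `envelope_spectrum`) are hypothetical and are not taken from the thesis, which used MATLAB.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def envelope(x):
    """Amplitude envelope: magnitude of the analytic signal (assumed Hilbert method)."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist when n is even), double positive bins, zero negative bins.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        positive = range(1, n // 2)
    else:
        positive = range(1, (n + 1) // 2)
    for k in positive:
        weights[k] = 2.0
    analytic = idft([spectrum[k] * weights[k] for k in range(n)])
    return [abs(z) for z in analytic]


def envelope_spectrum(x, fs):
    """One-sided magnitude spectrum of the mean-removed envelope."""
    env = envelope(x)
    mean = sum(env) / len(env)
    spectrum = dft([e - mean for e in env])
    n = len(env)
    half = n // 2 + 1
    freqs = [k * fs / n for k in range(half)]
    mags = [abs(spectrum[k]) / n for k in range(half)]
    return freqs, mags


# Synthetic stand-in for one IMFe: a 40 Hz carrier amplitude-modulated at 4 Hz.
fs = 256  # sampling rate in Hz (assumed for illustration)
n = 256   # one second of samples
signal = [(1.0 + 0.5 * math.cos(2 * math.pi * 4 * t / fs))
          * math.cos(2 * math.pi * 40 * t / fs)
          for t in range(n)]

freqs, mags = envelope_spectrum(signal, fs)
peak_hz = freqs[max(range(len(mags)), key=mags.__getitem__)]
print(peak_hz)  # the envelope spectrum peaks at the 4 Hz modulation rate
```

In this toy case the carrier frequency disappears from the envelope spectrum and only the modulation rate remains, which is the property that makes envelope spectra useful for describing slow temporal modulations in speech.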
The thesis further analyzes the relationship between the dimensional ratings and the ESMP in terms of the peak value (PV), instant at which the peak occurs (IP), centroid (C), equivalent width (EW) and mean square abscissa (MSA) of the ESMP in the 4-dimensional V-A-D-P space.

A relatively novel cross-correlation algorithm, based on an ESMP extractor and a fuzzy support vector regression (FSVR) classifier, is proposed for fuzzy emotional classification of Chinese speech. The technique has been used to classify the fuzzy emotions (some, half and very) joy, surprise and surprise-joy in Chinese speech. The FSVR classifier employs a fuzzy cascade bisection (FCB) process and is suited to the envelope spectral features of the cross-correlation of emotional speech signals. The FSVR classifier aided by this envelope spectral cross-correlation algorithm could considerably improve the fuzzy emotion recognition rate for Chinese speech, and it detects the very-joy emotion efficiently, with a recognition rate of 92.58%.

In conclusion, taking the perceptual listening tests and the computer test together, the results of the two are consistent, and using the ESMP could considerably improve the fuzzy emotion recognition rate for Chinese speech. As a new attempt, this dissertation proposes a novel feature (ESMP) and two new algorithms (EEMD and FSVR) that have a theoretical basis and practical effect. The work benefits future study of continuous speech emotion computing and human-machine speech emotional interaction.
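The five ESMP descriptors named in the abstract (PV, IP, C, EW, MSA) can be sketched as follows. The abstract does not give their formulas, so the definitions below are assumptions based on the usual textbook forms: centroid as the first moment divided by the area, mean square abscissa as the second moment divided by the area, and equivalent width taken here as the area divided by the peak value. The function name `esmp_descriptors` and the triangular test curve are hypothetical illustrations, not data from the thesis.

```python
def esmp_descriptors(x, y):
    """Summary descriptors of a sampled non-negative curve y(x) on a uniform grid.

    Assumed definitions (not confirmed by the abstract):
      PV  = max(y)
      IP  = x at which the maximum occurs
      C   = sum(x * y) / sum(y)
      EW  = area / PV
      MSA = sum(x^2 * y) / sum(y)
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    dx = x[1] - x[0]
    area = sum(y) * dx  # rectangle-rule integral on the uniform grid
    peak_index = max(range(len(y)), key=y.__getitem__)
    pv = y[peak_index]
    if area <= 0 or pv <= 0:
        raise ValueError("curve must have positive area and a positive peak")
    first_moment = sum(xi * yi for xi, yi in zip(x, y)) * dx
    second_moment = sum(xi * xi * yi for xi, yi in zip(x, y)) * dx
    return {
        "PV": pv,
        "IP": x[peak_index],
        "C": first_moment / area,
        "EW": area / pv,
        "MSA": second_moment / area,
    }


# Hypothetical triangular pattern used only to exercise the function.
example = esmp_descriptors([0.0, 1.0, 2.0, 3.0, 4.0],
                           [0.0, 1.0, 2.0, 1.0, 0.0])
print(example)
```

Each descriptor reduces a whole pattern to a single number, which is what allows the patterns to be compared against the listener ratings on each emotion dimension.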
Keywords/Search Tags: Continue Emotion Computing, envelope spectral modulation patterns (ESMP), ensemble empirical mode decomposition (EEMD), fuzzy support vector regression (FSVR), cross-correlation algorithm, fuzzy cascade bisection (FCB)