
Research on Emotion Speech in Cross-Cultural Backgrounds

Posted on: 2011-06-05    Degree: Master    Type: Thesis
Country: China    Candidate: L L Zheng    Full Text: PDF
GTID: 2178360308965544    Subject: Computer software and theory
Abstract/Summary:
Most research on emotional speech recognition and synthesis focuses on the differences in acoustic parameters among emotions, then trains models on high-quality samples, and finally searches for new recognition algorithms to improve accuracy. Yet every technique for emotional speech recognition and synthesis ultimately rests on how humans produce and perceive emotion. In this thesis we focus on the human emotion perception mechanism in order to build a new model for emotional speech recognition and synthesis.

Emotion expression and perception are shaped by cultural background, yet subjects from different cultural backgrounds may share a common perceptual pattern. We therefore design a cross-cultural emotion perception experiment. Four speakers are recruited: two Chinese (one male, one female) and two Japanese (one male, one female). Subjects are divided into four groups by language background: Chinese, Chinese who have learned Japanese, Japanese, and Japanese who have learned Chinese. Seven emotions are video-recorded: Angry, Disgust, Fear, Happy, Neutral, Sad and Surprise. Video-only and audio-only stimuli are then prepared, hereafter called V-only and A-only stimuli.

We want to know how cultural background affects emotion expression and perception through the vocal and facial modalities. To this end we carry out an emotion perception experiment, an emotional feature perception experiment and an acoustic parameter analysis, and finally synthesize emotional speech from the acoustic parameters.

In the emotion perception experiment we find that: (1) Japanese subjects give results similar to Chinese subjects when perceiving Chinese speakers' emotions even though they have never learned Chinese, and Chinese subjects behave likewise when perceiving Japanese speakers' emotions; (2) in the audio-only experiment, Japanese subjects who have learned Chinese perceive Chinese speakers more like native Chinese subjects do, and the same holds for Chinese subjects who have learned Japanese, whereas in the video-only experiment the two Japanese groups respond alike and the two Chinese groups respond alike; in other words, language learners perceive foreign speakers more like that language's native listeners only in the audio-only condition, so we conclude that cultural background affects emotion perception in the audio modality but not in the video modality; (3) some emotions are confused with each other, such as Angry with Disgust and Fear with Sad, in both the audio-only and video-only experiments.

In the emotional feature perception experiment we obtain the six features most relevant to each emotion and find that: (1) subjects from different cultural backgrounds rely on common features when perceiving the same emotion; (2) certain pairs of emotions, such as Angry and Disgust or Fear and Sad, share common features in both the audio-only and video-only experiments.

Finally, we extract the acoustic parameters of the emotional speech and analyze pitch, duration, prosody and related measures. Compared with Neutral speech, Angry speech has a shorter duration while the other emotional utterances are longer, and the pitch of the six non-neutral emotions is higher than that of Neutral speech. Based on these observations we derive a method to synthesize emotional speech from Neutral speech.
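The acoustic analysis and the Neutral-based synthesis described above can be illustrated with a minimal sketch. The code below is only an assumed illustration, not the thesis implementation: it relies on the librosa and soundfile Python libraries, uses placeholder file names (neutral.wav, angry.wav), and replaces a full prosody model with simple global pitch shifting and time stretching.

    # Minimal sketch (assumed, not the thesis implementation): compare the pitch and
    # duration of an emotional utterance with a Neutral one, then derive a crude
    # emotional version from the Neutral recording by prosody modification.
    import numpy as np
    import librosa
    import soundfile as sf

    def pitch_and_duration(path):
        """Return mean F0 (Hz, voiced frames only) and duration (s) of a wav file."""
        y, sr = librosa.load(path, sr=None)
        f0, voiced, _ = librosa.pyin(y, sr=sr,
                                     fmin=librosa.note_to_hz("C2"),
                                     fmax=librosa.note_to_hz("C7"))
        mean_f0 = float(np.nanmean(f0[voiced])) if np.any(voiced) else float("nan")
        return mean_f0, librosa.get_duration(y=y, sr=sr)

    # Placeholder file names; any Neutral/emotional utterance pair would do.
    neutral_f0, neutral_dur = pitch_and_duration("neutral.wav")
    angry_f0, angry_dur = pitch_and_duration("angry.wav")

    # Ratios relative to Neutral: Angry is expected to be shorter and higher-pitched.
    pitch_ratio = angry_f0 / neutral_f0
    tempo_ratio = neutral_dur / angry_dur   # > 1 means the emotional version is faster

    # Rule-based resynthesis from the Neutral recording: shift the pitch (in semitones)
    # and stretch or compress time so its global prosody approaches the target emotion.
    y_neutral, sr = librosa.load("neutral.wav", sr=None)
    n_steps = 12 * np.log2(pitch_ratio)                  # Hz ratio -> semitones
    y_emotional = librosa.effects.pitch_shift(y_neutral, sr=sr, n_steps=n_steps)
    y_emotional = librosa.effects.time_stretch(y_emotional, rate=tempo_ratio)
    sf.write("synth_angry.wav", y_emotional, sr)

Such a global transformation only captures the coarse pitch and duration trends reported above; a complete system would also need to model prosody variation within the utterance.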
Keywords/Search Tags:Emotion, Emotion Perception, Acoustic analysis, Cross-culture, Emotion speech synthesis