Study On Chinese Speech Emotion Recognition Based On SVM

Posted on: 2008-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: T Lu
Full Text: PDF
GTID: 2178360212995303
Subject: Signal and Information Processing
Abstract/Summary:
The human-machine interface grows in importance as information and communication services develop. Emotion in speech has received much attention in recent years, in the context of both speech synthesis and automatic speech recognition. From a signal processing point of view, a speech signal carries linguistic information as well as the speaker's tone and emotion. Emotions are traditionally classified into two main categories: primary (basic) and secondary (derived). Primary emotions, including fear, anger, joy, sadness and disgust, are generally those that are experienced by all social mammals and have particular manifestations associated with them. Secondary emotions, such as pride, gratitude, tenderness and surprise, are variations or combinations of primary ones and may be unique to humans.

Automatic emotion recognition from human speech can be viewed as a pattern recognition problem. This paper applies support vector machines (SVMs) to emotion recognition in Mandarin speech. Emotions are classified into four categories, chosen according to previous studies and the author's own experiments. Two small-scale Mandarin emotional speech databases, one speaker-independent and one speaker-dependent, were built by cutting and recording. Features related to energy, pitch and speech rate are extracted from the speech signal. This paper analyzes these emotional features for four basic human emotions: anger, happiness, sadness and surprise.

Four SVMs, one for each of the four emotions, were used. The i-th SVM is trained with all of the training data in the i-th class labeled positive and all other training data labeled negative.
In the emotion recognition process, the feature vector is fed simultaneously into all four SVMs, and the output of each SVM is examined by decision logic that selects the best emotion: the SVM that gives the positive label is chosen, and the class of that SVM is the recognition result. Recognition results cover both the speaker-independent and the speaker-dependent case. The recognition results and an analysis of the recognition experiments are reported. The paper concludes by summarizing problems that remain unsolved and discussing future work in this field.
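
The one-against-rest scheme described in the abstract can be sketched roughly as follows. This is only an illustration, not the thesis's implementation: the abstract does not state the kernel, the training algorithm or the feature dimension, so the sketch assumes a linear SVM trained by batch subgradient descent on the hinge loss, three made-up features, and synthetic data with no acoustic meaning. Picking the SVM with the highest decision value is also an assumption; it agrees with "the SVM that gives the positive label" when exactly one SVM fires, and resolves the cases where none or several do. All function names are hypothetical.

```python
import random

EMOTIONS = ["anger", "happiness", "sadness", "surprise"]


def decision_value(w, b, x):
    """Signed score of a linear SVM; a positive score means 'this emotion'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit (w, b) by batch subgradient descent on the L2-regularized hinge loss.

    X is a list of feature vectors, y a list of +1 / -1 labels.
    The hyperparameters are illustrative assumptions, not values from the thesis.
    """
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision_value(w, b, xi) < 1.0:  # sample violates the margin
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(X, labels):
    """Train one SVM per emotion: that class is positive, all others negative."""
    models = {}
    for emotion in EMOTIONS:
        y = [1 if lab == emotion else -1 for lab in labels]
        models[emotion] = train_linear_svm(X, y)
    return models


def recognize(models, x):
    """Feed x to every SVM and return the emotion with the highest score."""
    scores = {e: decision_value(w, b, x) for e, (w, b) in models.items()}
    return max(scores, key=scores.get), scores


# Invented cluster centers in a 3-dimensional feature space. The three axes
# stand in for normalized energy, pitch and speech-rate features, but the
# values are arbitrary and do not describe real emotional speech.
PROTOTYPES = {
    "anger": (1.0, 1.0, 1.0),
    "happiness": (1.0, -1.0, -1.0),
    "sadness": (-1.0, 1.0, -1.0),
    "surprise": (-1.0, -1.0, 1.0),
}


def make_toy_data(per_class=20, jitter=0.1, seed=0):
    """Generate synthetic samples scattered around each invented center."""
    rng = random.Random(seed)
    X, labels = [], []
    for emotion, proto in PROTOTYPES.items():
        for _ in range(per_class):
            X.append([p + rng.uniform(-jitter, jitter) for p in proto])
            labels.append(emotion)
    return X, labels


X, labels = make_toy_data()
models = train_one_vs_rest(X, labels)
predicted, scores = recognize(models, [0.9, 1.1, 1.0])
print(predicted, scores)
```

On real data the four classes would not be this cleanly separated, and a kernel SVM from an established library would be the more usual choice; the sketch only shows how the four binary classifiers and the decision logic fit together.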
Keywords/Search Tags: Human computer interaction, Speech signal, Emotion recognition, SVM, Pattern recognition