
Neural Network-based Chinese Speech Emotion Recognition

Posted on: 2005-02-26    Degree: Master    Type: Thesis
Country: China    Candidate: Q Wang    Full Text: PDF
GTID: 2208360122470022    Subject: Software and theory
Abstract/Summary:
With the rapid development of human-computer interaction systems, emotion in speech has received much attention in recent years, in the context of speech synthesis as well as automatic speech recognition. Speech is the most convenient means of communication between people and, together with facial expression, one of the fundamental ways of conveying emotion. Emotion therefore plays an important role in communication. It is not difficult to imagine that if an individual lost both the ability to speak and the means of expressing emotion, vocally or even physically, due to paralysis, his or her life would be very isolated and depressing.

From a signal-processing point of view, a speech signal carries linguistic information as well as the speaker's tone and emotion. Emotions are traditionally classified into two main categories: primary (basic) and secondary (derived) emotions. Primary emotions, including fear, anger, joy, sadness and disgust, are generally those experienced by all social mammals and have particular manifestations associated with them. Secondary emotions, such as pride, gratitude, tenderness and surprise, are variations or combinations of primary ones and may be unique to humans.

Automatic emotion recognition from human speech can be viewed as a pattern recognition problem. This thesis introduces the application of neural networks and principal component analysis to emotion recognition of Mandarin speech. We recorded a Mandarin emotional speech database containing both speaker-independent and speaker-dependent emotional speech. Energy-, pitch- and speech-rate-related features are extracted from the speech signal, and these features are analysed for four basic human emotions: anger, happiness, sadness and fear.

The dimension of the input feature vector is large, but its components are highly correlated (redundant). We therefore preprocess the neural network training set with principal component analysis, which transforms the input data so that the elements of the input vectors become uncorrelated; in addition, the size of the input vectors may be reduced. Three types of neural network, OCON (one-class-one-network), ACON (all-class-one-network) and LVQ (learning vector quantization), are used to recognize the four basic emotions. Emotion recognition of Mandarin speech based on relative emotional features is also introduced. The recognition results and an analysis of the experiments are reported. Finally, we summarize problems that remain unsolved and discuss future work in this field.
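
To make the PCA preprocessing step concrete, the sketch below is a minimal illustration, not the pipeline used in the thesis: the random feature matrix, the 95% explained-variance threshold, and the MLP classifier are all illustrative assumptions. It shows how prosodic feature vectors (energy, pitch and speech-rate statistics) could be standardized, decorrelated and reduced with principal component analysis, and then passed to a small neural-network classifier using scikit-learn.

    # Illustrative sketch of PCA preprocessing before a neural-network
    # emotion classifier; feature matrix, component count and classifier
    # are assumptions, not the thesis' actual implementation.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline

    # X: one row per utterance, columns are energy/pitch/speech-rate statistics
    # y: emotion labels (0=anger, 1=happiness, 2=sadness, 3=fear)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 24))    # placeholder feature matrix
    y = rng.integers(0, 4, size=200)  # placeholder labels

    model = make_pipeline(
        StandardScaler(),                  # zero-mean, unit-variance features
        PCA(n_components=0.95),            # keep components explaining 95% of variance
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    )
    model.fit(X, y)
    print("training accuracy:", model.score(X, y))

After the StandardScaler/PCA steps, the classifier receives uncorrelated, lower-dimensional inputs, which is the effect the abstract describes; any of the network types mentioned above (OCON, ACON or LVQ) could stand in place of the generic MLP used here.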
Keywords/Search Tags:Human Computer Interaction, Emotion Recognition, Speech Signal, Mandarin Emotional Speech Database, Principal Component Analysis, Neural Network, Relative Emotion Feature