Research On Emotional Recognition In Multilingual Speech Signal

Posted on:2011-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:B Li

Full Text:PDF

GTID:2178360308457365

Subject:Signal and Information Processing

Abstract/Summary:


Speech is an important, distinctly human means of expressing emotion; it carries the speaker's emotional and psychological state as well as semantic information. Traditional speech processing systems usually focus on recognizing the content of speech accurately and neglect these psychological characteristics. In recent years, growing demand from natural human-machine interaction, psychological testing, intelligent robots and many other fields has drawn increasing attention to emotion analysis and recognition in speech signals, which has become a new research hotspot. However, emotion recognition still needs further study. The construction of emotional speech databases, the selection and extraction of emotional feature parameters, and emotion recognition itself have not yet formed a systematic theory. Research on emotion recognition is usually based on English, while Chinese has received less attention. Furthermore, the emotional parameters studied are mainly prosodic features, while multimodal recognition, which integrates semantic information, facial expressions and physiological signals, has also received little attention. Speech emotion recognition can therefore be said to be at a preliminary stage, and deeper research is needed.

To study emotion recognition based on multilingual speech signals, this thesis covers the construction of an emotional speech database in five languages (Chinese, English, Japanese, Korean and Russian), the analysis of prosodic feature parameters, the extraction of emotional feature parameters, speech emotion recognition, and emotion recognition combined with semantic information. The main contents of this thesis are as follows:

First, emotion is divided into five categories: calm, happiness, anger, surprise and sadness. Emotional speech is then recorded under laboratory conditions, and a multilingual emotional speech database is built for further research.

Second, the speech signals of the five emotions spoken in the different languages are analyzed acoustically, and prosodic feature parameters are extracted. After analyzing the emotional speech signals and comparing the acoustic features of the different emotions, a general rule of emotional speech features is drawn: for a given emotion, the changes in the parameters show common patterns across languages.

Third, emotion recognition experiments are carried out on the multilingual emotional speech database using two algorithms, Principal Component Analysis and the Gaussian Mixture Model. The two algorithms achieve average recognition accuracies of 74.2% and 78.1% respectively.

Fourth, building on the acoustic features, an emotion recognition experiment that integrates semantic information is carried out. Words with different emotional coloring are annotated, and the semantic information of a sentence is extracted using Dynamic Time Warping, which is used to recognize emotional keywords. The prosodic features are then combined with the semantic information to recognize the emotion using the Gaussian Mixture Model. The experimental results show that the combined features improve recognition accuracy by 3 percent compared with the prosodic features alone.

The main innovations of this thesis are: first, an emotional speech database covering multiple languages is built, prosodic features are extracted, and a general rule of emotional speech features is drawn; second, an emotion recognition experiment that integrates semantic information with acoustic features is carried out and achieves better recognition accuracy than prosodic features alone.
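The abstract names Dynamic Time Warping as the method for recognizing emotional keywords but gives no implementation details. The following is a minimal sketch of template-based keyword matching with DTW, not the thesis's actual procedure. The function names (dtw_distance, spot_keyword), the Euclidean frame distance, the length normalization and the one-dimensional toy frames are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature frames."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumed)
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def spot_keyword(segment, templates):
    """Return (label, score) of the template closest to the segment.

    The score is the DTW cost divided by the summed lengths, a simple
    normalization chosen for this sketch.
    """
    best_label, best_score = None, math.inf
    for label, template in templates.items():
        score = dtw_distance(segment, template) / (len(segment) + len(template))
        if score < best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical toy templates: one-dimensional "frames" standing in for real
# acoustic feature vectors of annotated emotional keywords.
templates = {
    "happy": [(0.0,), (1.0,), (2.0,), (1.0,)],
    "sad": [(5.0,), (5.0,), (4.0,)],
}
# A time-stretched version of the "happy" template.
segment = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,), (1.0,)]
label, score = spot_keyword(segment, templates)
```

Because DTW allows frames to repeat, the stretched segment aligns with the "happy" template at zero cost even though the two sequences differ in length, which is the property that makes it suitable for matching words spoken at different rates.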
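The abstract states that prosodic features are combined with semantic information and classified with a Gaussian Mixture Model, without saying how the two are combined. The sketch below assumes the simplest option, concatenating a keyword score onto the prosodic vector, and then picks the emotion whose mixture gives the highest log-likelihood. The diagonal covariances, the fuse helper and every parameter value are hypothetical; in practice the mixture parameters would be fitted to training data (for example with EM) rather than written by hand.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, components):
    """Log-likelihood of x under a mixture given as (weight, mean, var) tuples."""
    logs = [math.log(w) + log_gaussian_diag(x, mu, var) for w, mu, var in components]
    peak = max(logs)
    # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(value - peak) for value in logs))


def fuse(prosodic, keyword_score):
    """Assumed fusion rule: append the semantic keyword score to the prosodic vector."""
    return tuple(prosodic) + (keyword_score,)


def classify(x, models):
    """Pick the emotion whose GMM assigns x the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(x, models[label]))


# Hypothetical per-emotion mixtures over (pitch, energy, keyword score),
# with made-up normalized values purely for illustration.
models = {
    "anger": [
        (0.6, (1.5, 1.5, 1.0), (1.0, 1.0, 0.5)),
        (0.4, (2.5, 2.0, 1.0), (1.0, 1.0, 0.5)),
    ],
    "sadness": [
        (1.0, (-1.5, -1.0, -1.0), (1.0, 1.0, 0.5)),
    ],
}

predicted = classify(fuse((2.0, 1.8), 1.0), models)
```

Training one mixture per emotion and comparing likelihoods is a common way to use GMMs as classifiers; whether the thesis used this exact decision rule is not stated in the abstract.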

Keywords/Search Tags:

speech emotional recognition, multimode recognition, Principal Component Analysis, Gaussian Mixture Model, Dynamic Time Warping


Related items

1	Research And Implementation Of Multi-lingual Speech Emotion Recognition
2	Research And Implementation Of Gaussian Mixture Model-based Speech Emotion Recognition
3	Based On The Design Of Small-vocabulary Speech Recognition System And Speech Recognition
4	Emotional Speech Recognition Based On Facial Expression Analysis
5	Study On Feature Extraction In Speaker Recognition
6	Research On Speaker Confirmation Technology Based On Pronunciation Action Parameters
7	Research On Prediction Of Emotional Dimensions PAD For Speech Emotion Recognition
8	Design And Implementation Of An Improved Dynamic Time Warping-based Speech Recognition Algorithm
9	Research Of Speaker Recognition Based On The Improvement Feature Parameters
10	Research On Speech Recognition