With the rapid development of computer network communication and multimedia technology, new human-computer interaction (HCI) technology has become a significant research direction in computer applications, and emotional speech recognition is of great importance to achieving natural human-computer interaction. Building on advances in psychology, physiology, neuroscience, computer technology and pattern recognition, speech emotion recognition has made great progress in both theoretical research and practical application, and pilot studies of speech emotion recognition frameworks already exist. However, existing technologies and methods cannot meet the demand for high-performance speech emotion recognition. In particular, research on Mandarin speech emotion recognition is scarce, and more work is needed to fill this gap.

Against the background of natural HCI applications, this paper analyzes the main problems of current speech emotion recognition and investigates several key techniques in this field, as follows:

(1) Establishment of a Mandarin emotional speech database. We propose a new recording method that uses text information to induce emotion. Using studio recording and clips from movies, the database collects a total of 1350 emotional speech sentences in 5 categories: anger, fear, happiness, neutral and sadness.

(2) Analysis of acoustic features and feature selection. We analyze the statistical characteristics of prosodic and formant features of speech under different emotional states, and the dynamic characteristics of prosodic features as the emotional state changes. Based on this analysis, we propose a two-phase feature selection method that selects features with small intra-class distance and large inter-class distance to improve recognition accuracy.

(3) Fusion of statistical features and time-sequential features. We put forward a new method that adopts an improved LBG algorithm to fuse statistical features with time-sequential features. Through experiments, we identify the best feature vector configuration for different emotion recognition applications.

(4) Speech emotion recognition system based on an artificial neural network. We present an experimental prototype system called ERSNN, which adopts a neural network and integrates all the techniques described in this paper.

This paper offers new ideas and effective solutions for the establishment of an emotional speech database, the selection and fusion of emotional features, and the implementation of an emotion recognition system, and provides a feasible reference for further research on emotion recognition mechanisms.
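
The abstract does not say how the improved LBG algorithm differs from the standard one, nor how the codebook is used to combine the two feature types. For orientation only, the following is a minimal sketch of the standard LBG (Linde-Buzo-Gray) codebook training procedure, not the improved variant proposed here. The function name `lbg_codebook`, its parameters, and the additive split perturbation are assumptions made for illustration.

```python
import numpy as np


def lbg_codebook(vectors, size, eps=0.01, tol=1e-4, max_iter=100):
    """Train a codebook with the standard LBG splitting procedure.

    Hypothetical helper for illustration; `size` is assumed to be a power of two.
    """
    vectors = np.asarray(vectors, dtype=float)
    # Start from a single codeword: the centroid of all training vectors.
    codebook = vectors.mean(axis=0, keepdims=True)
    while codebook.shape[0] < size:
        # Split every codeword into a slightly perturbed pair
        # (additive perturbation is an assumption of this sketch).
        codebook = np.vstack([codebook + eps, codebook - eps])
        prev = np.inf
        for _ in range(max_iter):
            # Squared Euclidean distance from every vector to every codeword.
            diff = vectors[:, None, :] - codebook[None, :, :]
            dist = (diff ** 2).sum(axis=2)
            nearest = dist.argmin(axis=1)
            distortion = dist[np.arange(len(vectors)), nearest].mean()
            # Move each codeword to the centroid of its cell;
            # a codeword with an empty cell is left unchanged.
            for k in range(codebook.shape[0]):
                members = vectors[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
            # Stop when the relative drop in distortion is small.
            if prev - distortion <= tol * distortion:
                break
            prev = distortion
    return codebook


# Toy data: two well-separated groups of 2-D feature vectors.
toy = [[0, 0], [0, 1], [1, 0], [1, 1],
       [10, 10], [10, 11], [11, 10], [11, 11]]
toy_codebook = lbg_codebook(toy, size=2)
```

In a setting like the one described above, each row of `vectors` would be one feature vector extracted from speech, and the resulting codewords would give a fixed-size summary of a variable-length feature sequence. How that summary is combined with the statistical features is specific to the paper's method and is not reproduced here.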