
Research And Application Of Speech Emotion Recognition

Posted on: 2010-11-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Liu
Full Text: PDF
GTID: 1118360302958555
Subject: Computer Science and Technology

Abstract/Summary:
With the development of human-computer interaction technology, research on human-computer interfaces has gradually moved from the era of mechanized interfaces into the era of multimedia interfaces. As one of the key technologies in intelligent human-computer interaction, speech emotion analysis and recognition has become a research hot spot. Researchers from various fields are concerned with how to make computers automatically recognize a speaker's emotional state from speech signals and respond in a more targeted and more human way.

This paper first summarizes the research significance of speech emotion recognition and the main content of this work. It then reviews several key issues in current studies of speech emotion, including the categories of emotional states, an overview of emotional corpora, acoustic features of speech signals, feature dimensionality reduction, classification algorithms, and semi-supervised speech emotion classification.

This paper presents several models for feature selection and feature extraction. Speech emotion recognition based on a fusion of all-class and pairwise-class feature selection is a new model structure: it focuses on discriminating between every pair of emotional states while simultaneously taking the overall distribution of samples into account, so both all-class and pairwise-class feature selection are involved. The model structure is compatible with many classification algorithms and can effectively improve the performance of the recognition system. Feature selection based on a feature projection matrix uses the projection matrix obtained from feature extraction to evaluate the importance of the initial acoustic features, and then completes feature subset selection according to these importance scores. Experimental results show that, compared with the feature extraction method that simply uses the projection matrix for data mapping, this feature selection algorithm has clear advantages.
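The abstract does not specify the exact scoring rule used to turn a projection matrix into feature importances. A minimal sketch of one common variant, assuming a PCA projection and an explained-variance-weighted norm of each feature's loadings (both of these choices are illustrative, not taken from the dissertation):

```python
import numpy as np
from sklearn.decomposition import PCA

def rank_features_by_projection(X, n_components=5, k=10):
    """Rank original features by their weight in a learned projection matrix.

    Scoring rule (illustrative, not necessarily the dissertation's):
    weight each PCA component's loadings by its explained-variance ratio,
    then score each feature by the L2 norm of its weighted loadings.
    """
    pca = PCA(n_components=n_components).fit(X)
    # components_ has shape (n_components, n_features)
    W = pca.components_ * pca.explained_variance_ratio_[:, None]
    importance = np.linalg.norm(W, axis=0)    # one score per original feature
    top_k = np.argsort(importance)[::-1][:k]  # indices of the k best features
    return top_k, importance

# Toy usage: 200 samples of 40 hypothetical acoustic features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
selected, scores = rank_features_by_projection(X, n_components=5, k=10)
print(selected)
```

The selected index set can then feed a classifier directly, which is the step that distinguishes this selection scheme from simply mapping the data through the projection matrix.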
Through analysis of the data, a hierarchical framework of feature extraction for speech emotion recognition selects different dimensionality reduction algorithms to process corpora of different genders or different emotional states. This idea can be extended to other corpora: constructing a suitable recognition system based on hierarchical dimensionality reduction improves recognition performance.

The enhanced Lipschitz embedding algorithm based on manifold learning is a nonlinear dimensionality reduction algorithm. By computing geodesic distances, high-dimensional feature vectors are mapped into a low-dimensional subspace. The algorithm dramatically improves recognition accuracy in speaker-dependent and speaker-independent speech emotion recognition under controlled laboratory conditions, as well as in speaker-dependent speech emotion recognition under Gaussian white noise and sinusoidal noise.

In traditional speech emotion recognition systems, each acoustic feature is treated as one component of a simple composite feature vector that is fed to the classifiers. Speech emotion recognition based on covariance descriptors and the Riemannian manifold instead considers the correlations between different acoustic features. Experimental results show that these correlations reflect emotional information, and a recognition system built on them has high stability and noise robustness.

Given a small number of labeled samples and a large number of unlabeled samples, this paper presents an enhanced co-training algorithm that builds a classification model based on semi-supervised learning. It introduces a restriction on the label predictors to improve the standard co-training algorithm.
This algorithm reduces the introduction of classification noise and improves classifier performance.

Considering the practical application of speech emotion research, this paper proposes an AdaBoost+C4.5 classification model to analyze the emotional states of real-time speech signals. We implement a complete real-time emotion recognition model and apply it in a real-time facial animation system driven by emotional speech.
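The AdaBoost+C4.5 combination, boosting over decision-tree weak learners, can be sketched with scikit-learn. Note that scikit-learn's trees are CART rather than C4.5, so this is only a structural analogue, and the synthetic features and labels below stand in for real acoustic features and emotion classes:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Toy stand-in for utterance-level acoustic features and emotion labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # two synthetic "emotions"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# AdaBoost over shallow decision trees (CART here, as a C4.5 analogue).
clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=3),
    n_estimators=50,
    random_state=0,
)
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
print(accuracy)
```

In a real-time setting, the fitted `clf.predict` call would run on each incoming feature vector extracted from the live speech stream, with its output driving the facial animation.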
Keywords/Search Tags: Speech emotion recognition, all-class and pairwise-class feature selection, feature selection based on feature projection matrix, hierarchical feature extraction, enhanced Lipschitz embedding algorithm, covariance descriptor and Riemannian manifold