Research On Active Learning And Semi-supervised Learning Methods In Speech Emotion Recognition

Posted on:2014-10-26

Degree:Master

Type:Thesis

Country:China

Candidate:X P Zhang

Full Text:PDF

GTID:2268330422951687

Subject:Computer technology

Abstract/Summary:

Speech emotion recognition, especially large-scale emotion recognition, requires a large amount of annotated emotional speech. Unlabeled natural speech is easy to collect in quantity, but annotation is costly, especially dimensional labeling. This thesis studied active learning and semi-supervised learning under both the discrete emotion model and the dimensional emotion model. Active learning focuses on how to choose the most informative data for human annotation, while semi-supervised learning focuses on how the model can label emotional speech automatically. The goal is to achieve high performance at low annotation cost.

To choose the most informative emotional speech for human annotation, and thereby reduce annotation cost, we studied active learning methods in speech emotion recognition. Under the discrete emotion model, since the output probability is a measure of classification certainty, we studied the least confidence, margin sampling and information entropy methods, all of which are based on the output probability. We also studied the query by committee method, constructing a committee of models to represent different regions of the version space. Vote entropy and K-L divergence were used to measure the disagreement among committee members; larger disagreement indicates a more informative sample. Under the dimensional emotion model, output probabilities cannot be estimated directly because the task is a regression problem. We therefore studied a least confidence method that discretized the continuous annotation and then approximated the informativeness with a discrete classification model. Similarly, we studied a query by committee method that constructed a committee of regression models and estimated the informativeness of speech from the variance of their outputs.

To exploit unlabeled data to improve learning performance, we studied semi-supervised learning methods in speech emotion recognition. Under the discrete emotion model, since speech with a high output probability is classified with the most certainty, we explored the self-learning method based on confidence thresholds and the co-training method using different feature subsets. We also used a graph Laplacian to represent the similarity between emotional speech samples and propagated labels to neighboring vertices according to that similarity, namely the label propagation algorithm. Under the dimensional emotion model, we explored graph regularized methods based on the manifold assumption. Considering the squared loss on the labeled data and the model complexity norm in Hilbert space, we studied LapRLS, which uses the Laplacian matrix defined on the manifold structure as a regularization term, and CoRLS, which uses the disagreement between two prediction functions based on two different views as a regularization term.

Finally, we studied the combination of active learning and semi-supervised learning methods under the discrete emotion model. The combined method augmented the training set by selecting the least confident speech for human annotation and adding the most confident speech with labels assigned by the model.
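The three output-probability criteria named in the abstract (least confidence, margin sampling, information entropy) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the `select_queries` helper, the four-class emotion set and the posterior values are all hypothetical, and the scores follow the standard textbook definitions.

```python
import math


def least_confidence(probs):
    """1 minus the top class probability; larger means less certain."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most probable classes; smaller means less certain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy in nats; larger means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs, k, score=entropy, larger_is_uncertain=True):
    """Return the indices of the k utterances the score ranks as most uncertain."""
    order = sorted(
        range(len(pool_probs)),
        key=lambda i: score(pool_probs[i]),
        reverse=larger_is_uncertain,
    )
    return order[:k]


# Hypothetical classifier posteriors for three unlabeled utterances
# over an assumed class set (angry, happy, neutral, sad).
pool = [
    [0.90, 0.05, 0.03, 0.02],
    [0.40, 0.35, 0.15, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]

by_entropy = select_queries(pool, 2)
by_margin = select_queries(pool, 2, score=margin, larger_is_uncertain=False)
```

Under all three scores the uniform posterior (index 2) is ranked as the most informative utterance and the confident one (index 0) as the least, which is the behavior the abstract relies on when sending low-confidence speech to human annotators.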

Keywords/Search Tags:

Speech Emotion Recognition, Active Learning, Semi-supervised learning, dimensional emotion, graph regularized methods

Related items

1	Speech Emotion Recognition Based On A Semi-supervised Learning Research
2	Research On Key Techniques Of Speech Emotion Recognition
3	Research On Key Technologies Of Speech Emotion Recognition
4	Research On Speech Dimensional Emotion Recognition Method In Social Media
5	A Study On Recognition Of Emotions In Speech
6	Research On Speech Emotion Recognition Based On Deep Learning
7	Research And Implementation Of Emotion Monitoring System Based On Speech Emotion Recognition
8	Research On Emotion Recognition Of Monomodal Speech And Multimodal Speech Vision Based On Transfer Learning
9	Research On Feature Fusion Method Of Speech Emotion Recognition Based On Deep Learning
10	Bimodal Emotion Recognition Based On Facial Expression And Speech