Research On Cross-corpus Speech Emotion Recognition Based On Target Adaptation

Posted on:2021-11-02

Degree:Master

Type:Thesis

Country:China

Candidate:X Z Chen

Full Text:PDF

GTID:2518306452477464

Subject:Electronics and Communications Engineering

Abstract/Summary:

Speech emotion recognition (SER) is a research direction spanning pattern recognition, signal processing, and related fields, and its application in human-computer interaction has gradually become a hot topic. SER uses algorithms to assign speech signals to emotional categories such as "happy", "sad", and "disgust". Many effective methods have been proposed over the history of SER research, but most of them are studied on a single database. In practical applications, however, the data distributions of the training corpus (source domain) and the test corpus (target domain) differ considerably because the data are collected in different environments and with different equipment. This thesis studies cross-corpus speech emotion recognition, in which the training set and the test set come from different databases. Research on cross-corpus SER mainly covers two aspects: features and classifier models. The main work of this thesis is as follows:

(1) A Target-adapted Subspace Learning model is proposed for cross-corpus speech emotion recognition. The method learns a projection that maps speech features into the label space, thereby establishing a relationship between the source domain and the target domain and reducing the difference between their feature distributions more effectively. To obtain a more efficient projection matrix, the l₁ and l₂ norms are used as regularization terms in the model. The IS09 and IS10 feature sets from the INTERSPEECH emotion challenge are extracted, and the model is validated on three public databases (EmoDB, eNTERFACE, and AFEW 4.0). A comparison with existing cross-corpus methods shows that the proposed method is effective, and that the IS09 feature set performs better than the IS10 feature set in this experiment.

(2) A Long Short-Term Memory (LSTM) network from deep learning is used to realize the Target-adapted Subspace Learning model. The experiments show the Target-adapted Subspace Learning model proposed in this thesis to be effective. More effective features are also a research focus in cross-corpus studies. The IS09 features used with the traditional machine learning approach are therefore taken as the network input, a corresponding loss function is designed according to the proposed model, and the LSTM network is then used to classify the samples. Finally, the model is compared with other domain adaptation models, and the experimental results show that it achieves better results.
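The abstract does not state the objective function of the Target-adapted Subspace Learning model, so the following is only a minimal sketch of the general idea in item (1): learn a projection matrix that maps features into the label space, regularized by l₁ and l₂ norms, while pulling the projected source and target means together. The objective, the mean-alignment term, the proximal-gradient solver, the toy data generator, and all names (`fit_projection`, `make_toy_corpus`, `lam1`, `lam2`, `mu`) are assumptions made for illustration, not the thesis's actual formulation.

```python
import numpy as np


def soft_threshold(A, t):
    """Proximal operator of t * ||A||_1 (element-wise shrinkage)."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)


def fit_projection(Xs, Ys, Xt, lam1=0.01, lam2=0.1, mu=1.0, n_iter=500):
    """Assumed objective (NOT taken from the thesis):

        1/(2 n_s) ||Xs P - Ys||_F^2            regression to one-hot labels
      + mu/2  ||(mean(Xs) - mean(Xt)) P||^2    source/target mean alignment
      + lam2/2 ||P||_F^2 + lam1 ||P||_1        l2 and l1 regularization

    Solved with proximal gradient descent (ISTA). Target labels are not used.
    """
    n_s, d = Xs.shape
    gap = (Xs.mean(axis=0) - Xt.mean(axis=0)).reshape(-1, 1)
    # Hessian of the smooth part; its largest eigenvalue bounds the step size.
    H = Xs.T @ Xs / n_s + mu * (gap @ gap.T) + lam2 * np.eye(d)
    b = Xs.T @ Ys / n_s
    step = 1.0 / np.linalg.eigvalsh(H).max()
    P = np.zeros((d, Ys.shape[1]))
    for _ in range(n_iter):
        grad = H @ P - b  # gradient of the smooth terms
        P = soft_threshold(P - step * grad, step * lam1)
    return P


def predict(X, P):
    """Assign each sample to the label-space dimension with the largest score."""
    return np.argmax(X @ P, axis=1)


def make_toy_corpus(n, shift, seed, n_classes=3):
    """Hypothetical stand-in for a corpus: Gaussian classes plus a domain shift."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_classes, size=n)
    means = 3.0 * np.repeat(np.eye(n_classes), 2, axis=1)  # shape (3, 6)
    X = means[y] + 0.5 * rng.standard_normal((n, 2 * n_classes)) + shift
    return X, y


if __name__ == "__main__":
    Xs, ys = make_toy_corpus(300, shift=0.0, seed=0)  # labeled "source corpus"
    Xt, yt = make_toy_corpus(300, shift=0.5, seed=1)  # shifted "target corpus"
    P = fit_projection(Xs, np.eye(3)[ys], Xt)
    print("source accuracy:", (predict(Xs, P) == ys).mean())
    print("target accuracy:", (predict(Xt, P) == yt).mean())
```

In this sketch the target corpus contributes only its unlabeled feature mean, which mirrors the cross-corpus setting where target labels are unavailable during training. Real IS09 or IS10 features would replace the toy generator.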

Keywords/Search Tags:

Cross-corpus Speech Emotion Recognition, Transfer Learning, Subspace Learning, Target-adapted, Long Short Term Memory Network

Related items

1	Research On Cross-corpus Speech Emotion Recognition Based On Progressive Distribution Adaption And Emotion Discriminability Enhancement
2	Cross-Corpus Speech Emotion Recognition Based On Subspace Learning
3	Research On Cross-corpus Speech Emotion Recognition Technology Based On Transfer Learning
4	Speech Emotion Recognition Based On Transfer Regression And Subspace Learning
5	Research On Transfer Subspace Learning For Speech Emotion Recognition
6	Research On Speech Emotion Recognition Based On Spatiotemporal Feature Fusion
7	Speech Emotion Recognition Based On Deep Learning Technology
8	Multimodal Emotion Recognition From Speech And Text
9	Research And Application Of Speech Emotion Recognition Algorithm Based On Deep Learning
10	A Dual-Subspace Transfer Learning Framework For Cross-Corpus Emotion Recognition