Research On Cross-corpus Speech Emotion Recognition Based On Target Adaptation

Posted on: 2021-11-02
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Chen
GTID: 2518306452477464
Subject: Electronics and Communications Engineering
Abstract/Summary:
Speech emotion recognition (SER) is a research direction within pattern recognition, signal processing, and related fields, and its application to human-computer interaction has gradually become a hot topic. SER uses algorithms to assign speech signals to emotional categories such as "happy", "sad", and "disgust". Many effective methods have been proposed over the history of SER research, but most of them are studied on a single database. In practical applications, however, the data distributions of the training corpus (source domain) and the test corpus (target domain) differ greatly because the data are collected in different environments and with different equipment. This thesis studies cross-corpus speech emotion recognition, in which the training set and the test set come from different databases. Research on cross-corpus SER mainly covers two aspects: features and classifier models. The main work of this thesis is as follows:

(1) A Target-adapted Subspace Learning model is proposed for cross-corpus speech emotion recognition. The method finds a projection space that maps speech features into the label space, thereby establishing a relationship between the source domain and the target domain and reducing the difference between their feature distributions more effectively. To obtain a more efficient projection matrix, the l1 and l2 norms are used as regularization terms of the model. The IS09 and IS10 feature sets from the INTERSPEECH emotion challenge are extracted, and the model is validated on three public databases (EmoDB, eNTERFACE, and AFEW 4.0). A comparison with existing cross-corpus methods shows that the proposed method is effective, and that the IS09 feature set performs better than the IS10 feature set in this experiment.

(2) The Long Short Term Memory Network from deep learning is combined with the Target-adapted Subspace Learning model. The Target-adapted Subspace Learning model proposed in this thesis has been shown to be superior in experiments, and finding more effective features is also a research focus in cross-corpus studies. The IS09 features used in the traditional machine learning experiments are therefore taken as the network input, a corresponding loss function is designed according to the proposed model, and a Long Short Term Memory Network is then used to classify the samples. Finally, the model is compared with other domain adaptation models. The experimental results show that the model achieves better results.
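The abstract does not state the objective function of the Target-adapted Subspace Learning model, so the following is only a minimal sketch of the general idea in item (1), under assumptions that are not from the thesis: source features are regressed onto one-hot labels through a projection matrix P, the projected source and target means are pulled together as a simple stand-in for reducing the distribution difference, and l1 and l2 penalties regularize P. The solver (proximal gradient descent), the mean-alignment term, and all names and parameters (fit_projection, lam, alpha, beta) are hypothetical, and the data are synthetic.

```python
import numpy as np

def soft_threshold(M, t):
    """Proximal operator of t * ||M||_1: shrink every entry toward zero by t."""
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)

def objective(P, Xs, Ys, Xt, lam, alpha, beta):
    """Assumed objective: label fit + mean alignment + l1 + l2 penalties."""
    fit = 0.5 * np.sum((Xs @ P - Ys) ** 2)
    gap = Xs.mean(axis=0) - Xt.mean(axis=0)
    align = 0.5 * lam * np.sum((gap @ P) ** 2)
    return fit + align + alpha * np.abs(P).sum() + 0.5 * beta * np.sum(P ** 2)

def fit_projection(Xs, Ys, Xt, lam=1.0, alpha=0.01, beta=0.1, n_iter=500):
    """Minimize the assumed objective with proximal gradient descent (ISTA)."""
    d, c = Xs.shape[1], Ys.shape[1]
    gap = (Xs.mean(axis=0) - Xt.mean(axis=0))[:, None]   # shape (d, 1)
    # Hessian of the smooth part (fit + alignment + l2 penalty).
    H = Xs.T @ Xs + lam * (gap @ gap.T) + beta * np.eye(d)
    step = 1.0 / np.linalg.eigvalsh(H)[-1]                # 1 / largest eigenvalue
    B = Xs.T @ Ys
    P = np.zeros((d, c))
    for _ in range(n_iter):
        grad = H @ P - B                                  # gradient of the smooth part
        P = soft_threshold(P - step * grad, step * alpha) # l1 proximal step
    return P

# Synthetic stand-in for two corpora: same classes, shifted target features.
rng = np.random.default_rng(0)
centers = 2.0 * rng.normal(size=(3, 8))
ys = rng.integers(0, 3, size=60)
yt = rng.integers(0, 3, size=40)
Xs = centers[ys] + rng.normal(size=(60, 8))
Xt = centers[yt] + rng.normal(size=(40, 8)) + 0.5        # target-domain shift
Ys = np.eye(3)[ys]                                        # one-hot source labels

P = fit_projection(Xs, Ys, Xt)
pred_t = np.argmax(Xt @ P, axis=1)                        # predicted target classes
print("target accuracy on synthetic data:", np.mean(pred_t == yt))
```

Only labeled source data and unlabeled target features enter the fit; the target labels are used solely to score the synthetic example. In practice Xs and Xt would hold feature vectors such as IS09 or IS10 extracted from two different corpora.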
Keywords/Search Tags:Cross-corpus Speech Emotion Recognition, Transfer Learning, Subspace Learning, Target-adapted, Long Short Term Memory Network