A Dual-Subspace Transfer Learning Framework for Cross-Corpus Emotion Recognition

Posted on: 2020-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
GTID: 2428330578981126
Subject: Detection Technology and Automation
Abstract/Summary:
Speech, as the most important tool for human communication, plays an important role in the field of artificial intelligence. Traditional speech emotion recognition methods are normally built on the assumption that the training and test sets come from the same database and therefore share the same feature distribution. However, corpora differ in their acquisition methods, their recording conditions, and the resulting feature distributions of the training and test sets, so traditional speech emotion recognition methods cannot solve the problem of cross-database recognition. Transfer learning has been shown to reduce the differences between domains. Therefore, our work proposes a Dual-Subspace Transfer Learning (DSTL) framework to enhance the performance of cross-corpus emotion recognition. The framework combines common and specific information and compensates for a shortcoming of feature-mapping based transfer learning methods. The detailed contents are as follows:

(1) Our work builds a Mandarin Emotional Speech Database Portrayed (MES-P), which can be used in cross-corpus emotion recognition. The speech samples are recorded by 16 speakers (8 males, 8 females) according to seven discrete emotional labels, and are then mapped to the Valence/Arousal (VA) space by seven annotators based on their auditory perception and subjective judgment. The database therefore describes emotions not only from a discrete perspective but also, through the continuous dimensions, the connections between emotions, which can be used for future research.

(2) Our work studies feature-mapping based transfer learning methods under different constraint criteria. Following these methods and some related works, two different algorithms, global maximum mean discrepancy and local graph embedding, are incorporated into a modified principal component analysis to obtain three different methods for learning commonalities. The results show that the recall of the feature-mapping based transfer learning methods increases by 8.11% compared with traditional machine learning methods. In addition, the global and local constraint criteria yield different recognition performance in balanced and imbalanced cases.

(3) Our work proposes a dual-subspace transfer learning framework. The term "dual-subspace" refers to: a) a common subspace: feature-mapping based transfer learning methods are used to learn a latent common subspace in which distribution differences are reduced and commonalities are preserved; b) a specific subspace: to address the lack of specific information, our work proposes a mapping strategy that learns a specific subspace preserving the specific information of the source and target domains. The framework can therefore improve feature-mapping based transfer learning methods by adding the specific information. Finally, the results show that the recalls of the improved methods are 3.05% higher than those of the baseline methods, with recall reaching up to 61.67%.
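As a point of reference for the global maximum mean discrepancy (MMD) criterion named in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes a linear kernel, under which the squared MMD reduces to the squared Euclidean distance between the source and target feature means. The function name `linear_mmd2` and the toy feature matrices are hypothetical and chosen only for illustration.

```python
import numpy as np


def linear_mmd2(source, target):
    """Squared MMD under a linear kernel (an assumption of this sketch).

    With a linear kernel the statistic equals the squared Euclidean
    distance between the per-feature means of the two sample sets.
    Rows are samples, columns are features.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    if source.shape[1] != target.shape[1]:
        raise ValueError("source and target must share the feature dimension")
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)


# Toy "corpora": two samples each, two features per sample (hypothetical data).
source_feats = np.array([[0.0, 0.0], [2.0, 2.0]])  # mean = [1, 1]
target_feats = np.array([[3.0, 1.0], [3.0, 1.0]])  # mean = [3, 1]

print(linear_mmd2(source_feats, target_feats))  # 4.0
print(linear_mmd2(source_feats, source_feats))  # 0.0
```

A larger value indicates a larger gap between the mean feature vectors of the two corpora. A feature-mapping method of the kind described above would look for a projection under which this gap shrinks; kernels other than the linear one would also compare higher-order statistics.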
Keywords/Search Tags:Speech Emotion Recognition, Transfer Learning, Maximum Mean Discrepancy, Graph Embedding