
Research On Transfer Subspace Learning For Speech Emotion Recognition

Posted on: 2022-09-08    Degree: Master    Type: Thesis
Country: China    Candidate: W J Zhang
GTID: 2518306488966639    Subject: Engineering

Abstract/Summary:
Speech plays an important role as a communication carrier in daily life and in many real-world applications. In recent years, speech emotion recognition has become an attractive research topic in the fields of affective computing and human-machine interaction, and it is important for promoting the development of artificial intelligence. The main goal of speech emotion recognition is to classify speech signals into emotional states such as happiness, anger, fear, and sadness, and it has proven valuable in many human-machine interaction applications. However, current speech emotion recognition methods have two main drawbacks. First, the dimensionality of speech features is high and contains much redundant and noisy information. Second, most current methods are trained and evaluated on a single corpus, whereas in reality emotional speech data are collected under different scenarios or environments (different languages, noise conditions, and speakers), which leads to large differences between corpora. Therefore, this thesis combines transfer learning and subspace learning into a joint framework to study the cross-corpus speech emotion recognition problem, using transfer subspace learning to reduce the dimensionality of the data and to transfer the data and the training model simultaneously. The main research content is as follows:

To tackle the cross-corpus speech emotion recognition problem, the first work proposes a novel transfer sparse discriminant subspace learning method. The goal is to learn a corpus-invariant projection while transferring knowledge. A common feature subspace across different corpora is learned by introducing discriminative subspace learning together with an ℓ2,1-norm constraint, which yields the most discriminative features across corpora. In addition, a novel nearest-neighbor graph is constructed as the distance metric, with which the similarity between different corpora can be measured.

The second work presents a novel transfer learning method, called joint transfer subspace learning and regression (JTSLR), which performs transfer subspace learning and regression in a joint framework. JTSLR learns a common subspace by introducing a discriminative maximum mean discrepancy as the discrepancy metric, which reduces the divergence between the feature distributions of different corpora. A regression function is then formulated in this latent subspace to describe the relationship between features and their corresponding labels. Moreover, a label graph is introduced to effectively transfer the knowledge gained from the source corpus to the target corpus.

Extensive experiments are conducted on three popular emotional speech datasets: Berlin, eNTERFACE, and BAUM-1a. The results show that, compared with traditional methods and several state-of-the-art transfer learning algorithms, the proposed methods achieve competitive performance on cross-corpus speech emotion recognition tasks.
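To make the ℓ2,1-norm constraint mentioned in the first work concrete, the following is a minimal illustrative sketch (not the thesis's actual formulation): it computes the row-wise ℓ2,1-norm of a projection matrix, the quantity that, when penalized, drives whole rows of the projection to zero and thereby selects a sparse subset of speech features. The matrix sizes are invented for the example.

```python
import numpy as np

def l21_norm(W):
    """Row-wise l2,1-norm: the sum of the l2 norms of the rows of W.
    Penalizing this term encourages entire rows of the projection
    matrix to shrink to zero, i.e. it performs feature selection."""
    return np.sum(np.linalg.norm(W, axis=1))

# Toy projection matrix: 6 acoustic features projected to 3 dimensions.
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 3))
print(l21_norm(W))
```

A solver for the full objective would trade this sparsity term off against the discriminative subspace criterion; only the norm itself is shown here.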
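For the second work, the discrepancy metric is a discriminative variant of the maximum mean discrepancy (MMD). As a hedged illustration of the underlying idea only (the plain MMD with an RBF kernel, not the discriminative version used by JTSLR), the sketch below estimates how far apart the feature distributions of a source corpus and a target corpus are; the feature dimensions and sample counts are placeholders.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def mmd2(Xs, Xt, gamma=1.0):
    """Squared maximum mean discrepancy between source features Xs and
    target features Xt; a small value means the two corpora are close
    in the kernel feature space."""
    Kss = rbf_kernel(Xs, Xs, gamma)
    Ktt = rbf_kernel(Xt, Xt, gamma)
    Kst = rbf_kernel(Xs, Xt, gamma)
    return Kss.mean() + Ktt.mean() - 2 * Kst.mean()

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 20))  # e.g. source-corpus features
Xt = rng.normal(0.5, 1.0, size=(80, 20))   # e.g. target-corpus features
print(mmd2(Xs, Xt, gamma=0.1))
```

Minimizing such a discrepancy while learning the projection is what reduces the feature-distribution divergence between corpora in the learned subspace.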
Keywords/Search Tags:speech emotion recognition, cross-corpus speech emotion recognition, subspace learning, transfer learning