
Research on Speech Emotion Recognition Technology Based on Multi-Level Classification and DBN with Multi-Feature Fusion

Posted on: 2019-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: L P Ma
GTID: 2518306044460054
Subject: Control Engineering
Abstract/Summary:
Speech is one of the most important means of communication in daily human life, and it carries rich emotional information. Research on speech emotion recognition is of great practical significance for making computers more intelligent and human-friendly, for developing new human-machine environments, and for advancing fields such as psychology. The main problems and difficulties addressed in this thesis are as follows:

(1) A single emotional feature can hardly express a given emotional state. Emotion is easily affected by the speaker's tone, speed, volume, and content, and most emotional features fail to capture the full emotional information. There is still no accurate theoretical basis for deciding which features fully express which emotional state, and this seriously limits the emotion recognition rate.

(2) How to construct a suitable classification mechanism that reduces the error rate between easily confused emotional states. Some emotions show relatively similar characteristics, yet the traditional SVM approach to speech emotion recognition uses only a single level of classification for all emotions, which results in a high false recognition rate between confusable emotions.

(3) How to shorten the training time of a DBN while still establishing an accurate network model. In traditional DBN training the learning rate is constant. When the target error is being reduced, time efficiency is not considered, so real-time performance is insufficient.

(4) Single-modal speech emotion recognition relies on a single source of emotional feature information, so its recognition performance and robustness are limited. In practical applications, complex backgrounds such as noise seriously affect speech emotion recognition, and it is difficult to identify the emotion type accurately from the speech modality alone.

The contributions and innovations of this thesis mainly include the following points:

(1) Multi-feature fusion based on MFCC is proposed. The thesis fuses short-time energy, pitch frequency, formant frequency, and MFCC to replace a single emotional feature, which compensates for the shortcomings of any one feature. Because the fused feature vector has too high a dimension and contains irrelevant or redundant features, LDA dimensionality reduction is used to remove redundant information, with the aim of improving the speech emotion recognition rate.

(2) A multi-stage SVM classification algorithm is proposed. Starting from the confusion matrix of the traditional method, the thesis introduces the concept of confusion and proposes a multi-stage classification algorithm: emotions that are easy to distinguish are separated first, then the confused emotions are classified, so that the emotion type of the speech to be recognized is determined step by step.

(3) A DBN based on an adaptive learning rate is proposed. A dynamic learning rate is added to the network learning process. The criterion for adjusting the learning rate is whether the weight correction is effective, that is, whether it reduces the target error. If the error is reduced, the learning rate can be increased by some amount; otherwise, the learning rate should be reduced.

(4) Multimodal fusion of speech signals and face images is proposed. Features that reflect vocal emotion and facial expression are extracted separately, and two multimodal information fusion strategies, feature-level fusion and decision-level fusion, are then used to realize multimodal emotion recognition. The experimental results show that the emotion recognition rate obtained by fusing speech and face is higher than that of a single modality.

The research results described above provide a theoretical reference and support for the research field of speech emotion recognition.
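
The learning-rate rule described in contribution (3) can be illustrated with a minimal Python sketch. It is not the thesis's DBN implementation: a toy gradient descent on a scalar weight stands in for DBN training, and the function names `adapt_learning_rate` and `train`, the growth and shrink factors `up` and `down`, and the choice to discard an ineffective update are all assumptions made for illustration.

```python
def adapt_learning_rate(lr, prev_error, new_error, up=1.05, down=0.7):
    """Return the learning rate to use for the next update.

    The rate grows when the last weight correction reduced the target
    error and shrinks otherwise. The factors `up` and `down` are
    illustrative assumptions, not values taken from the thesis.
    """
    if new_error < prev_error:
        return lr * up
    return lr * down


def train(grad, loss, w, lr=0.1, steps=50):
    """Toy gradient descent on a scalar weight using the rule above."""
    error = loss(w)
    for _ in range(steps):
        candidate = w - lr * grad(w)      # proposed weight correction
        new_error = loss(candidate)
        lr = adapt_learning_rate(lr, error, new_error)
        if new_error < error:
            # Assumption: keep the correction only when it was effective.
            w, error = candidate, new_error
    return w, error, lr


# Hypothetical quadratic target error with its minimum at w = 3.
w, error, lr = train(grad=lambda w: 2 * (w - 3),
                     loss=lambda w: (w - 3) ** 2,
                     w=0.0)
```

In this sketch the rate keeps growing while updates help, which shortens training on easy stretches of the error surface, and it backs off as soon as an update overshoots.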
Keywords/Search Tags: emotion recognition, multi-feature fusion, multilevel classification, deep belief network, multimodal fusion