
Research On The Key Problems Of Speech Emotion Recognition Based On Fuzzy Cognitive Maps

Posted on: 2018-10-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Zhang
Full Text: PDF
GTID: 1318330569988983
Subject: Electronic Science and Technology
Abstract/Summary:
In recent years, with the continuous development of artificial intelligence, the demands placed on human-computer interaction interfaces have grown steadily. Speech emotion recognition, one of the key technologies of intelligent human-computer interaction, has become a research hotspot. Enabling a computer to sense a person's emotional state, so that communication between machine and human becomes more natural and realistic, has long been a goal of the industry. As research has deepened, speech emotion recognition has found more and more applications in areas such as fatigue testing, disease diagnosis and distance teaching. The study of speech emotion recognition therefore has important theoretical and practical value.

This dissertation focuses on the key issues in a speech emotion recognition system: building the emotional speech database, feature extraction, recognition model construction, fusion recognition networks and continuous-dimension emotion analysis. First, TYUT2.0, a database of excerpted speech that is closer to real emotion, is established. For feature extraction, the Hilbert-Huang Transform (HHT) and the Teager energy operator are combined in different ways to extract three new nonlinear speech emotion spectrum features. For the recognition model, Fuzzy Cognitive Maps (FCM) are proposed for constructing the speech emotion recognition network, and the Pleasure-Arousal-Dominance (PAD) values of the emotions are used to determine the weights between the emotion categories in the network. To further improve the recognition rate, emotional speech is studied from the aspects of feature fusion and decision fusion, and a weight-adaptive FCM network is put forward. Finally, speech emotion is analyzed from the perspective of continuous dimensions, and the PAD value of an emotion is obtained by weighted fusion prediction.

The main research contents and innovations are as follows:

(1) An emotional speech database is established. After analyzing and comparing the various methods of building a speech database, the excerpt method, which cuts emotional utterances from radio dramas, is chosen to construct the database. A comprehensive fuzzy evaluation method is proposed to assess the validity of the initial database and to filter its sentences, yielding a more effective emotional speech database for the later research.

(2) Using different combinations of the HHT and the Teager energy operator, three emotional features are extracted: Hilbert marginal spectral coefficients, Hilbert-Teager energy marginal spectral coefficients (HTMC) and EEMD-based Teager energy Mel frequency spectral coefficients (ETMC). The Hilbert marginal spectral coefficients are a set of cepstral coefficients computed by passing the marginal spectrum through a Mel filter. For HTMC, the emotional speech signal first undergoes the HHT, that is, ensemble empirical mode decomposition (EEMD) followed by the Hilbert transform; the Teager energy is then extracted from the resulting Hilbert spectrum, its marginal spectrum is computed, and finally the marginal spectral coefficients are calculated. For ETMC, the emotional speech signal is first decomposed by EEMD into a series of intrinsic mode function (IMF) components; the Teager energy of each IMF component is then extracted and the Mel frequency cepstral coefficients are calculated. The three new features are used for emotional speech recognition and compared with traditional features, and the experimental results confirm their validity.

(3) Based on the FCM network, a classification model dedicated to emotion recognition is proposed. Taking into account the characteristics specific to emotion in speech emotion data, an FCM network model for speech emotion recognition is constructed, with theoretical analysis and formula derivation. The PAD emotional dimension space model is used to characterize the emotional attribute values in the FCM model. Simulation results show that combining PAD with FCM outperforms traditional recognition networks and improves the recognition rate.

(4) Fusion features and an FCM network fused by a weight-adaptive algorithm are proposed for speech emotion recognition. Because a single feature or a single classifier cannot fully express all of the emotional information, which limits the recognition rate, fusion algorithms are applied to speech emotion recognition. Feature fusion linearly combines the proposed new features with different types of classical speech features to obtain a fused feature vector for recognition. The weight-adaptive algorithm allocates a fusion weight to each classifier according to the differences among the recognition networks built on different features, and then combines the classifiers' results by weighted fusion to improve recognition performance.

(5) Speech emotion recognition is studied from the perspective of continuous dimensions, and hesitant fuzzy sets are proposed for predicting the emotional PAD value. First, different features are used to recognize the emotion, and the recognition results of the network are mapped into the PAD emotion space to obtain the PAD value of the emotion; correlation analysis then gives the correlation coefficients between the different features and the P, A and D dimensions. Second, two different recognition networks are selected to recognize the emotional speech. Their probability-type outputs are used to construct a hesitant fuzzy matrix which, combined with the correlation coefficients from the previous step, is used in decision fusion to obtain the similarity between each emotion and P, A and D. Finally, the emotional PAD value is predicted by weighted fusion, the prediction is analyzed and validated in terms of probability and spatial distribution, and the correlation between the predicted value and the PAD value of the emotion itself is calculated. The experimental results show that the predicted data are valid.
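
The Teager energy operator named in the feature-extraction work above is commonly written in discrete form as psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below illustrates only that standard textbook operator; it is not taken from the dissertation, and it leaves out the EEMD, Hilbert and Mel filtering stages of the HTMC and ETMC features. The function name, the test tone and its parameters are hypothetical choices made for illustration.

```python
import math


def teager_energy(signal):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    # The operator needs one neighbour on each side, so the output
    # is two samples shorter than the input.
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


# Hypothetical test tone (an assumption, not data from the dissertation).
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
```

For a pure tone A*cos(omega*n) the operator returns the constant A^2 * sin(omega)^2 at every sample, so the output tracks amplitude and frequency jointly. Applying a function like this to each IMF component would be one way to sketch the per-component energy step described for ETMC, under the assumptions stated above.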
Keywords/Search Tags: Hilbert-Huang transform, Fuzzy cognitive maps, feature fusion, decision fusion, continuous dimension emotion analysis