
Research On Deep Learning-Based Bimodal Emotion Recognition In Open Domain Dialogue Systems

Posted on: 2022-03-16   Degree: Master   Type: Thesis
Country: China   Candidate: Y Z Huo   Full Text: PDF
GTID: 2518306335967929   Subject: Computer application technology
Abstract/Summary:
With the progress of science and technology, the pace of life is getting faster, many people are in a sub-healthy mental state, and the number of people who commit suicide because of depression rises every year. Many schools have begun to pay attention to students' mental health, but the number of students far exceeds the number of counseling teachers, and most off-campus counseling institutions are expensive and of uneven quality, so it is difficult to provide students with timely and effective psychological counseling. With the development of social media and the arrival of the big-data era, research on emotion analysis makes a private emotional counseling program possible: if a user's emotion can be identified in time, help from the outside world can be sought once the emotion exceeds a critical threshold, so that tragedy may be avoided.

According to how emotions are categorized, emotion analysis research is mainly divided into discrete and dimensional approaches. The discrete approach divides emotions into two categories (commendatory and derogatory) or more (happy, angry, sad, etc.). The dimensional approach constructs a space according to the polarity and intensity of emotion, in which each point represents an emotional state.

The common methods of sentiment analysis are based on dictionary rules, machine learning, and deep learning. The dictionary-rule method first constructs an emotion dictionary and classifies the words in it; the sentence to be judged is matched against the dictionary, and the final emotion class is decided according to designed rules. Machine learning methods require manually extracting feature values from the corpus and then applying a machine learning algorithm to classify the corpus according to those features. Deep learning methods design multi-layer networks to classify the emotion of the corpus, avoiding tedious feature engineering.

Most text sentiment analysis uses the discrete approach. Chinese text must first be split into a sequence of words, i.e., word segmentation. Because short texts in an open-domain dialogue system are colloquial, stop words need to be filtered out after segmentation; word-vector features are then extracted as the input of the deep learning model, which performs the emotion classification.

For speech, pre-emphasis is usually applied to enhance the high-frequency part of the signal, which is then divided into frames and windowed to avoid the Gibbs phenomenon. Prosodic, spectral, or voice-quality features are extracted as emotional features, and finally a classifier performs the emotion classification.
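
To make the speech preprocessing steps above concrete, the following is a minimal NumPy sketch of pre-emphasis, framing, and Hamming windowing. The function name and the parameter values (a 0.97 pre-emphasis coefficient, 25 ms frames with a 10 ms hop) are common defaults chosen for illustration, not values taken from the thesis.

```python
import numpy as np

def preprocess_speech(signal, sr, pre_emph=0.97, frame_ms=25, hop_ms=10):
    """Pre-emphasis, framing, and Hamming windowing of a speech signal (illustrative defaults)."""
    signal = np.asarray(signal, dtype=float)

    # Pre-emphasis boosts the high-frequency part: y[n] = x[n] - a * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])

    # Split the emphasized signal into overlapping frames
    frame_len = int(sr * frame_ms / 1000)
    hop_len = int(sr * hop_ms / 1000)
    n_frames = 1 + (len(emphasized) - frame_len) // hop_len
    frames = np.stack([emphasized[i * hop_len:i * hop_len + frame_len]
                       for i in range(n_frames)])

    # Apply a Hamming window to each frame to reduce spectral leakage
    # (the Gibbs phenomenon mentioned above)
    return frames * np.hamming(frame_len)
```

Prosodic or spectral features would then be extracted from the windowed frames before classification.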

Because of the complexity of human language and emotion, single-modality information such as text or speech alone cannot accurately judge a user's real emotion, so more and more researchers study multimodal emotion recognition; with the rise of deep learning in recent years, research on multimodal recognition algorithms has also made great progress. This thesis studies a text-speech bimodal sentiment analysis method and optimizes a deep-learning-based text sentiment analysis model and a deep-learning-based speech sentiment analysis model respectively.

In the text sentiment analysis part, a joint model of a convolutional neural network and a bidirectional long short-term memory network is used, and a further attention layer is added on top of the attention mechanism already used to improve the model. By comparing single deep learning networks with networks augmented with attention, it is verified that the CNN-BiLSTM network with two-level attention achieves better emotion classification.

In the speech emotion analysis part, the performance of each combined model and of the models with an attention mechanism is likewise compared. Considering the high dimensionality of speech emotion features and the long training time, an attention mechanism is added to the long short-term memory network and the bidirectional long short-term memory network to reduce the dimensionality; the optimized model performs similarly to the combined model while the training time is greatly shortened.

A bimodal sentiment analysis model is then built. Considering that emotion is expressed differently in subjective and objective corpora, a subjective/objective text classification mechanism is added to the speech-text bimodal emotion recognition model: the subjective/objective classification of the text corpus in the dataset determines the weight parameters assigned to the classification results of the two modalities at the decision level. The experimental results show that, on the CASIA and IEMOCAP datasets, the recognition accuracy of the bimodal combined model with the subjective/objective classification mechanism is higher than that of the bimodal combined model without it.
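
The decision-level fusion idea can be sketched as follows. The function name and the weight values are illustrative assumptions, since the abstract states only that the subjective/objective label of the text determines how the two modalities' results are weighted.

```python
import numpy as np

def fuse_decisions(text_probs, speech_probs, is_subjective,
                   subj_weights=(0.7, 0.3), obj_weights=(0.4, 0.6)):
    """Weighted decision-level fusion of text and speech emotion predictions.

    text_probs / speech_probs: per-class probabilities from the two unimodal
    models. is_subjective: output of the subjective/objective text classifier.
    The weight pairs are placeholders; the values actually used in the thesis
    are not given in the abstract.
    """
    w_text, w_speech = subj_weights if is_subjective else obj_weights
    fused = w_text * np.asarray(text_probs) + w_speech * np.asarray(speech_probs)
    return int(np.argmax(fused)), fused

# Example: the text classifier marks the utterance as subjective,
# so the text modality receives the larger weight.
label, probs = fuse_decisions([0.6, 0.3, 0.1], [0.2, 0.5, 0.3], is_subjective=True)
```
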
Keywords/Search Tags: emotion recognition, deep learning, multimodal fusion, long short-term memory network, attention mechanism