With the vigorous development of artificial intelligence in recent years, human-computer interaction technology has attracted increasing attention from researchers. As an important part of human-computer interaction, emotion recognition has gradually become a hot research field and shows great potential in social robots, medical treatment, education, and other areas. Compared with non-physiological signals such as facial expressions, physiological signals such as EEG and EOG are more objective and authentic and are difficult to disguise, so they are widely used in emotion recognition research. In the past, most studies used only a single modality for emotion recognition, which resulted in low classification accuracy and poor robustness; multimodal emotion recognition has therefore gradually become the mainstream. Multimodal emotion recognition, however, raises two issues: how to exploit the complementarity between different modalities to obtain richer emotional information, and how to construct a deep learning model that can effectively extract emotion-related features from multimodal signals.

Based on the EEG signals and several peripheral physiological signals in the multimodal datasets DEAP and SEED-IV, this thesis explores multimodal EEG emotion recognition methods from four aspects: feature extraction from different physiological signals, feature fusion across modalities, deep learning model construction, and emotion classification. It aims to address the incomplete emotional information and limited recognition accuracy of single-modality approaches, and at the same time to provide an effective path toward brain-computer interface applications built on multimodal emotion recognition. The specific research contents include the following two aspects:

(1) This thesis proposes a multimodal EEG-based emotion recognition method using an attention bidirectional gated recurrent unit neural network, denoted Mul-AT-BiGRU. First, an attention mechanism fuses three different features from two modalities, EEG signals and eye movement data, to achieve global interaction between the modal features. The fused multimodal features are then fed into the Mul-AT-BiGRU network for deep emotional feature extraction and classification. By mining the complementary relationships between the modalities, the model makes the learned deep emotion-related features more salient and discriminative, thereby improving emotion recognition performance. The proposed method was tested on the multimodal dataset SEED-IV: the average within-subject classification accuracy reached 0.9519, an improvement of 0.2022, 0.2004, and 0.1750 over the three single-modal features, respectively, and the average cross-subject classification accuracy reached 0.6277, which also outperforms comparable existing methods, verifying the effectiveness and generalization of the proposed method for multimodal EEG-based emotion recognition.
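To make the fusion-then-recurrence structure concrete, the following is a minimal sketch of attention-weighted feature fusion feeding a bidirectional GRU classifier, written in PyTorch. The choice of three feature streams, their dimensions (the 310-dimensional EEG features and 31-dimensional eye-movement features are illustrative), the layer sizes, and the use of the last time step for classification are assumptions made for the sketch, not the exact Mul-AT-BiGRU configuration.

```python
import torch
import torch.nn as nn


class AttentionFusionBiGRU(nn.Module):
    """Attention-weighted fusion of several feature streams, then a BiGRU classifier."""

    def __init__(self, feat_dims=(310, 310, 31), hidden=128, n_classes=4):
        super().__init__()
        # Project each feature stream (EEG or eye-movement features) to a common size.
        self.proj = nn.ModuleList([nn.Linear(d, hidden) for d in feat_dims])
        # One attention score per stream and time step, softmax-normalised across streams.
        self.att = nn.Linear(hidden, 1)
        self.bigru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, feats):
        # feats: list of tensors, each of shape (batch, time, feat_dim)
        projected = [p(f) for p, f in zip(self.proj, feats)]   # each (B, T, H)
        stacked = torch.stack(projected, dim=2)                # (B, T, S, H)
        weights = torch.softmax(self.att(stacked), dim=2)      # (B, T, S, 1)
        fused = (weights * stacked).sum(dim=2)                 # attention-fused: (B, T, H)
        out, _ = self.bigru(fused)                             # (B, T, 2H)
        return self.fc(out[:, -1])                             # logits: (B, n_classes)


if __name__ == "__main__":
    model = AttentionFusionBiGRU()
    eeg_a = torch.randn(8, 10, 310)   # hypothetical EEG feature stream (62 ch x 5 bands)
    eeg_b = torch.randn(8, 10, 310)   # second hypothetical EEG feature stream
    eye = torch.randn(8, 10, 31)      # hypothetical eye-movement feature stream
    print(model([eeg_a, eeg_b, eye]).shape)   # torch.Size([8, 4])
```

The softmax over the stream dimension lets the network reweight the modalities at every time step, which is one simple way to realise the global interaction between modal features described above.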
(2) This thesis proposes a multimodal emotion recognition method based on an attention recurrent graph convolutional neural network, denoted Mul-AT-RGCN. The method explores the relationships between the feature channels of EEG and peripheral physiological signals, converts one-dimensional sequence features into two-dimensional map features for modeling, and then extracts spatiotemporal and frequency-space features from the resulting multimodal representation. These two types of features are fed into a recurrent graph convolutional network with a convolutional block attention module for deep semantic feature extraction and emotion classification (a minimal sketch of the graph-convolution idea follows at the end of this section). To reduce differences between subjects, a domain adaptation module is also introduced for the cross-subject experiments. By exploiting the complementary relationships of the modalities, the method performs feature learning in the time, space, and frequency dimensions, so that the learned deep emotion-related features are more discriminative. The proposed method was tested on the multimodal dataset DEAP: the average within-subject classification accuracies for valence and arousal reached 0.9319 and 0.9182, improvements of 0.0510 and 0.0469 over the EEG-only modality, and also surpassed most existing methods. The cross-subject experiments likewise achieved better classification accuracies, verifying the effectiveness of the proposed method for multimodal EEG emotion recognition.

Compared with single-modality approaches, the Mul-AT-BiGRU and Mul-AT-RGCN methods proposed in this thesis make better use of the correlation and complementarity between different physiological modalities to obtain more complete emotion-related feature information, improve the accuracy and robustness of emotion recognition, and address the problem of low model generalization performance.
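As a companion sketch for the recurrent graph-convolution idea behind Mul-AT-RGCN, the PyTorch snippet below mixes channel features through a learnable adjacency matrix at each time step and then models the temporal dimension with a GRU. The 40-channel layout (32 EEG plus 8 peripheral channels, as in DEAP), the 5 per-channel features, the adjacency initialisation, and the omission of the convolutional block attention and domain adaptation modules are simplifying assumptions for illustration, not the thesis's implementation.

```python
import torch
import torch.nn as nn


class RecurrentGraphConv(nn.Module):
    """Graph convolution over physiological channels per time step, then a GRU over time."""

    def __init__(self, n_channels=40, in_dim=5, g_dim=32, hidden=64, n_classes=2):
        super().__init__()
        # Learnable channel adjacency, initialised near the identity.
        self.adj = nn.Parameter(torch.eye(n_channels) + 0.01 * torch.randn(n_channels, n_channels))
        self.gconv = nn.Linear(in_dim, g_dim)       # shared per-channel feature transform
        self.gru = nn.GRU(n_channels * g_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, time, channels, in_dim), e.g. band-power features per channel and window
        b, t, c, d = x.shape
        a = torch.softmax(self.adj, dim=-1)         # row-normalised adjacency
        h = torch.relu(a @ self.gconv(x))           # graph convolution: (B, T, C, g_dim)
        out, _ = self.gru(h.reshape(b, t, -1))      # temporal modelling over flattened channels
        return self.fc(out[:, -1])                  # logits, e.g. binary valence or arousal


if __name__ == "__main__":
    model = RecurrentGraphConv()
    x = torch.randn(8, 10, 40, 5)   # hypothetical 40-channel, 5-feature input windows
    print(model(x).shape)           # torch.Size([8, 2])
```

In a fuller version, a convolutional block attention module could reweight the channel and spatial dimensions of the two-dimensional map features, and a domain adaptation branch could reduce cross-subject distribution shift, as the method description above indicates.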