
Research On Emotion Recognition Method Based On Multi-Domain Feature Fusion Of EEG Signals

Posted on: 2024-01-14 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: R Li | Full Text: PDF
GTID: 1520307079488994 | Subject: Computer Science and Technology
Abstract/Summary:
Affective computing is a fundamental technology and an important prerequisite for achieving human-machine interaction and machine intelligence. It is an essential direction for the development of artificial intelligence, as only when machines accurately recognize human emotions can affective interaction with humans be established. Electroencephalogram (EEG) signals directly reflect the state of brain activity, making emotion recognition based on EEG signals closer to the essence of human emotion recognition research. However, EEG signals are complex, non-stationary, and non-linear, which limits the recognition performance and generalization ability of emotion recognition methods. The challenges faced by EEG-based emotion recognition research are therefore how to fully exploit the characteristics of EEG signals to obtain more discriminative data representations, and how to improve the recognition performance of emotion recognition models and construct models with good cross-subject generalization ability that meet the needs of real applications. To address these challenges, this study approached the problem from the perspective of multi-domain feature fusion of EEG signals, fully explored complementary and discriminative data representations of EEG signals, and constructed emotion recognition models with good performance and generalization ability. The main work and research results of this study are as follows:

(1) Proposed MOSNK, an ensemble learning model for emotion recognition based on multi-feature fusion and a multi-objective optimization algorithm. To comprehensively characterize the EEG signals, MOSNK extracted 13 time-frequency-domain and nonlinear EEG features and performed feature selection to form the final feature set. Unlike existing ensemble learning methods for emotion classification, MOSNK introduced an ensemble operator Ψ_m to transform the discrete classification results of the sub-models into continuous values ranging from -1 to 1. By
utilizing the ensemble operator, MOSNK solved the classification problem with a regression approach. Moreover, to avoid the influence of hyperparameters on the ensemble learning model, MOSNK used a multi-objective particle swarm optimization algorithm to quickly find the global optimal solution, determine the weight coefficient of each sub-model, and achieve the optimal emotion classification performance. Experimental results show that MOSNK outperformed its sub-models and four commonly used ensemble learning models (Voting, Boosting, Bagging, and Random Forest). In addition, experiments on the four EEG frequency bands (θ, α, β, and γ) showed that the β and γ bands achieve the higher classification accuracy among single bands. Overall, the full frequency band outperforms any single band, and combining EEG signals from all frequency bands achieves the best emotion recognition performance.

(2) Proposed SSTD, an emotion recognition model based on the fusion of spatio-temporal features and demographic information. Unlike existing emotion recognition methods based on the fusion of EEG spatio-temporal features, SSTD implemented dynamic time-window segmentation of EEG signals using single-link hierarchical clustering based on the Riemannian metric, which avoids the drawback of traditional fixed-time-window segmentation ignoring individual differences and removes the need to truncate EEG recordings of inconsistent length. SSTD used a Gated Recurrent Unit (GRU) network to learn the temporal features of EEG signals and a Symmetric Positive Definite matrix network (SPDNet) based on the Riemannian manifold to learn their spatial features. In addition, considering the correlation between demographic information and emotion, demographic information was innovatively introduced, and a joint optimization network was constructed to fuse
EEG temporal and spatial features and demographic information. Ablation results show that the spatio-temporal fusion model outperforms a single temporal or spatial model, that dynamic time-window segmentation outperforms the traditional fixed time window, and that including demographic information further improves emotion recognition performance. Compared with classical models, SSTD achieves better emotion recognition performance.

(3) Proposed STSNet, a deep learning emotion recognition model based on spatio-temporal-spectral feature fusion of EEG signals. STSNet innovatively constructed a 4-D spatio-temporal-spectral data representation and fed it into ManifoldNet, a deep network specialized for handling manifold-valued data, to learn high-level spatio-spectral features of EEG signals over time. At the same time, using a bidirectional long short-term memory network (BiLSTM) and the dynamic time window based on the Riemannian metric, STSNet learned high-level temporal features of EEG signals. Finally, the two high-level features were fused into EEG spatio-temporal-spectral features and used for emotion recognition with a jointly trained model. Ablation results show that the spatio-temporal-spectral fusion model achieves higher classification accuracy than the single-domain temporal or spatio-spectral models, confirming that fusing features from multiple domains yields more complementary and discriminative EEG features and benefits emotion recognition. Comparative experiments show that STSNet further improves the emotion recognition effect.

(4) Proposed VGGFuseNet, an emotion recognition model based on the fusion of deep latent features learned with unsupervised and supervised learning. VGGFuseNet used a Variational Auto-Encoder (VAE) to learn unsupervised spatio-temporal latent features from EEG
signals; meanwhile, it used Graph Convolutional Networks (GCN) to extract spatial-domain features of EEG signals, used a GRU to learn the spectral sequence containing the spatial features obtained by the GCN to produce spatio-spectral features, and finally fused the spatio-temporal-spectral features. Additionally, VGGFuseNet introduced deep metric learning, using the triplet-center loss to effectively minimize intra-class distances and maximize inter-class distances in the fusion feature space, and the focal loss to address the imbalanced class distribution of the emotion dataset. Ablation results show that fusing the EEG spatio-temporal-spectral latent features improves classification accuracy over using only spatio-temporal or spatio-spectral features, further demonstrating that multi-domain feature fusion yields more complementary and discriminative EEG features. Comparing loss functions shows that performance improves after introducing the focal loss and the triplet-center loss, and comparative experiments show that VGGFuseNet achieves better emotion recognition performance.
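The ensemble idea in (1) — mapping each sub-model's discrete prediction into [-1, 1] and fusing with learned weights so that classification is solved as regression — can be sketched as follows. This is a minimal NumPy illustration: the linear form of the operator `psi` and the toy weights are assumptions, not the dissertation's exact Ψ_m or the MOPSO-derived coefficients.

```python
import numpy as np

def psi(discrete_preds):
    """Assumed form of the ensemble operator: map {0, 1} labels to [-1, 1]."""
    return 2.0 * np.asarray(discrete_preds, dtype=float) - 1.0

def ensemble_predict(sub_model_preds, weights):
    """Weighted regression-style fusion of sub-model outputs.

    sub_model_preds: (n_models, n_samples) array of {0, 1} labels.
    weights: (n_models,) coefficients, e.g. found by multi-objective PSO.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()       # normalise the weight vector
    continuous = psi(sub_model_preds)       # (n_models, n_samples) in [-1, 1]
    score = weights @ continuous            # fused continuous score
    return (score >= 0).astype(int)         # threshold back to {0, 1}

# Three sub-models vote on four samples; the weighted fusion decides.
preds = np.array([[1, 0, 1, 0],
                  [1, 1, 0, 0],
                  [0, 1, 1, 0]])
print(ensemble_predict(preds, weights=[0.6, 0.3, 0.1]))  # -> [1 0 1 0]
```

Note how the second sample flips to class 0 because the heavily weighted first sub-model disagrees with the other two — exactly the behaviour a per-model weighting is meant to provide.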
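The dynamic time-window segmentation in (2) can be sketched in the same spirit: describe each short base window by its channel covariance (an SPD matrix) and merge adjacent windows whose covariances are close under an SPD distance. The log-Euclidean distance below is a simple stand-in for the dissertation's Riemannian metric, and the greedy neighbour merge is a simplification of single-link hierarchical clustering; window length and threshold are toy values.

```python
import numpy as np

def logm_spd(a):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.T

def spd_distance(a, b):
    """Log-Euclidean distance between two SPD matrices (Frobenius norm)."""
    return np.linalg.norm(logm_spd(a) - logm_spd(b))

def dynamic_windows(eeg, base_len, threshold):
    """Segment eeg (channels x samples) into variable-length windows.

    Returns a list of (start, end) sample indices.
    """
    n_ch, n_samp = eeg.shape
    n = n_samp // base_len
    covs = [np.cov(eeg[:, i*base_len:(i+1)*base_len]) + 1e-6*np.eye(n_ch)
            for i in range(n)]
    merged = [[0]]
    for i in range(1, n):
        # extend the current segment if the boundary windows look alike
        if spd_distance(covs[merged[-1][-1]], covs[i]) < threshold:
            merged[-1].append(i)
        else:
            merged.append([i])
    return [(seg[0]*base_len, (seg[-1]+1)*base_len) for seg in merged]

rng = np.random.default_rng(0)
# a toy 4-channel trial whose statistics jump half-way (amplitude x5)
eeg = np.concatenate([rng.normal(0, 1, (4, 512)),
                      rng.normal(0, 5, (4, 512))], axis=1)
segments = dynamic_windows(eeg, base_len=128, threshold=1.0)
print(segments)
```

Because the covariance changes sharply at sample 512, a segment boundary appears there, while statistically similar neighbours are merged into longer windows — the adaptivity that fixed windows lack.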
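The 4-D spatio-temporal-spectral representation in (3) can be illustrated by placing each channel's per-band power on a 2-D electrode grid and stacking the grids over time windows, giving a tensor of shape (time, bands, rows, cols). The 2x2 electrode layout, band edges, and FFT-based band power below are toy assumptions, not the dissertation's configuration.

```python
import numpy as np

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}
GRID = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}  # channel -> (row, col)

def band_power(x, fs, lo, hi):
    """Average FFT power of signal x in the [lo, hi) Hz band."""
    freqs = np.fft.rfftfreq(len(x), d=1/fs)
    spec = np.abs(np.fft.rfft(x)) ** 2
    mask = (freqs >= lo) & (freqs < hi)
    return spec[mask].mean()

def to_4d(eeg, fs, win):
    """eeg (channels x samples) -> (n_windows, n_bands, rows, cols) tensor."""
    n_ch, n_samp = eeg.shape
    rows = max(r for r, _ in GRID.values()) + 1
    cols = max(c for _, c in GRID.values()) + 1
    out = []
    for s in range(0, n_samp - win + 1, win):
        frame = np.zeros((len(BANDS), rows, cols))
        for ch in range(n_ch):
            r, c = GRID[ch]
            for b, (lo, hi) in enumerate(BANDS.values()):
                frame[b, r, c] = band_power(eeg[ch, s:s+win], fs, lo, hi)
        out.append(frame)
    return np.stack(out)

eeg = np.random.default_rng(1).normal(size=(4, 512))
rep = to_4d(eeg, fs=128, win=128)
print(rep.shape)  # -> (4, 4, 2, 2): windows x bands x grid rows x grid cols
```

The spatial two axes preserve electrode adjacency (so a convolutional or manifold network can exploit it), the band axis carries the spectral view, and the leading axis carries time.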
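The two losses combined in (4) can be sketched as follows. These are illustrative NumPy versions: the focusing parameter, margin, and toy data are assumptions, not the dissertation's settings.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, eps=1e-8):
    """Focal loss for binary labels: the (1 - p_t)^gamma factor down-weights
    easy, well-classified samples so training focuses on the hard/rare class."""
    probs = np.clip(probs, eps, 1 - eps)
    p_t = np.where(labels == 1, probs, 1 - probs)   # prob. of the true class
    return float(np.mean(-((1 - p_t) ** gamma) * np.log(p_t)))

def triplet_center_loss(features, labels, centers, margin=1.0):
    """Pull each feature toward its class centre and push it at least
    `margin` farther from the nearest other-class centre."""
    total = 0.0
    for f, y in zip(features, labels):
        d = np.linalg.norm(centers - f, axis=1)     # distance to every centre
        d_pos = d[y]                                # own-class centre
        d_neg = np.min(np.delete(d, y))             # nearest rival centre
        total += max(0.0, d_pos + margin - d_neg)
    return total / len(features)

# An easy sample contributes far less focal loss than a hard one.
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.55]), np.array([1]))
print(easy < hard)  # -> True

# Features already tight around their centres and margin-separated: zero loss.
feats = np.array([[0.1, 0.0], [2.0, 0.1]])
cents = np.array([[0.0, 0.0], [2.0, 0.0]])
print(triplet_center_loss(feats, np.array([0, 1]), cents))  # -> 0.0
```

Minimizing the triplet-center term shrinks intra-class distances while growing inter-class ones in the fused feature space, and the focal term keeps the imbalanced emotion classes from being dominated by the majority class — the two effects the abstract attributes to these losses.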
Keywords/Search Tags:Emotion Recognition, EEG, Multi-domain Feature Fusion, Ensemble Learning, Deep Learning