| Emotion recognition is a research hotspot in affective computing, with broad application value in human-computer interaction, medical and health monitoring, intelligent lie detection, and other areas. Compared with external expressions of emotion such as facial expressions, behavior, and voice, physiological signals, which are regulated by the nervous system and physiological mechanisms, reflect real emotions more objectively. This thesis therefore studies emotion recognition based on multi-modal physiological signals, aiming to mine discriminative multi-dimensional emotional features across modalities and thereby improve the performance of emotion recognition models. The main research contents are as follows. To address insufficient information interaction between physiological signals and multi-modal redundancy, the fusion mechanism of multiple physiological signals is studied, and a hybrid-fusion emotion recognition model based on a cross-modal attention mechanism is proposed to make full use of the complementary representations of the signals. First, high-quality physiological signals that contribute strongly to emotion recognition are selected through a series of comparative experiments. Second, a suitable recognition network is selected for each modality according to its characteristics, achieving an initial decision-level fusion. At the same time, a cross-modal feature fusion module extracts the interaction features between modalities. Within this module, GRUs capture the temporal dependencies in the sequences and aggregate the information, and the cross-modal attention (CMA) mechanism mines complementary features between physiological signals by using one modality to estimate its correlation with another. Finally, the results of each module are flexibly fused at the decision level. The experimental results show that the accuracy of the 
proposed model is 92.25% and 91.56% on the binary and four-class classification tasks, respectively, which confirms the feasibility and reliability of the model. To address the high redundancy, high time cost, and insufficient consideration of spatial correlation in feature extraction from multi-modal physiological signals, this thesis studies the spatial correlation of these signals and proposes SC-DTCN, an efficient, lightweight, end-to-end emotion recognition model. First, multi-channel datasets are constructed so that emotional features can be studied in the time-frequency domain while the correlation between channels is taken into account, laying a foundation for the subsequent extraction of more discriminative emotional features. Second, an attention mechanism is introduced to assign greater weight to effective features. Meanwhile, the dual-channel network (DCNTCN) extracts spatial, frequency-band, and temporal features to fully exploit the information contained in different dimensions. Pseudo-three-dimensional convolution blocks replace the ordinary convolution blocks in the densely connected blocks to improve the model's computational and storage efficiency. Finally, classification is completed through feature fusion. The experimental and test results show that the proposed emotion recognition model offers advantages in performance and stability. |
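
The idea of using one modality to estimate its correlation with another can be illustrated with a minimal sketch of scaled dot-product cross-attention. This is not the thesis's CMA module: it has no learned projections, the inputs merely stand in for per-time-step features (for example GRU outputs), and the names `cross_modal_attention`, `modality_a`, and `modality_b`, as well as the toy numbers, are assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(query_seq, context_seq):
    """Attend from one modality (queries) to another (keys and values).

    Both arguments are lists of feature vectors of equal width, one vector
    per time step. Returns (attended, weights): attended[t] is a weighted
    sum of context vectors, and weights[t][s] is the attention that query
    step t places on context step s.
    """
    d = len(query_seq[0])
    scale = math.sqrt(d)
    attended, weights = [], []
    for q in query_seq:
        # Similarity of this query step to every context step.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale
                  for k in context_seq]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the context vectors (used as values here).
        attended.append([sum(wj * v[i] for wj, v in zip(w, context_seq))
                         for i in range(d)])
    return attended, weights


# Hypothetical toy features: 3 time steps for one signal, 2 for the other.
modality_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
modality_b = [[0.5, 0.5], [2.0, 0.0]]
attended, weights = cross_modal_attention(modality_a, modality_b)
```

Each row of `weights` sums to one, so `attended` re-expresses the first modality's time steps in terms of the second modality's features. A learned version would typically add query, key, and value projections before the dot product.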
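
The storage benefit of pseudo-three-dimensional convolution can be made concrete with a parameter count. The sketch below assumes the common factorization of a k×k×k convolution into a 1×k×k convolution followed by a k×1×1 convolution; the abstract does not specify the exact block layout used in SC-DTCN, and the function names and the channel width of 64 are hypothetical.

```python
def conv3d_params(c_in, c_out, kt, kh, kw, bias=True):
    """Number of learnable parameters in a single 3D convolution layer."""
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)


def pseudo3d_params(c_in, c_out, k=3, bias=True):
    """Parameters of a 1 x k x k convolution followed by a k x 1 x 1 one."""
    within_slice = conv3d_params(c_in, c_out, 1, k, k, bias)
    across_slices = conv3d_params(c_out, c_out, k, 1, 1, bias)
    return within_slice + across_slices


# Assumed example: 64 input and 64 output channels, kernel size 3.
full = conv3d_params(64, 64, 3, 3, 3)   # 110656 parameters
factored = pseudo3d_params(64, 64)      # 49280 parameters
ratio = factored / full                 # roughly 0.45
```

Under these assumptions the factored block needs less than half the parameters of the full 3D convolution, which is consistent with the efficiency motivation stated above.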