
Multimodal Depression Recognition Based On Facial Expression And Pupil

Posted on: 2024-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: H R Li
GTID: 2544307079992639
Subject: Electronic Information · Computer Technology (Professional Degree)

Abstract/Summary:
Depression is one of the most serious mental health problems in modern society, with about 350 million cases worldwide and the number of patients increasing year by year. Depression can affect a person's work, study, and daily life, and severe depression can even lead to suicidal tendencies. Current clinical practice relies entirely on self-reporting and clinical evaluation, which carries the risk of subjective bias. With the development of emotion perception and machine learning, computer-aided diagnosis has come to play an important role in objective evaluation. In recent years, researchers have studied depression recognition using behavioral and physiological data such as facial expressions, language, electroencephalography, and eye movement. A single modality may suffer from information loss or incompleteness, resulting in low recognition accuracy, whereas multimodal fusion analysis can effectively combine information from multiple modalities so that they supplement and complement each other, overcoming the shortcomings of single modalities and improving depression recognition performance. Facial expressions are one of the most important channels of human emotional expression and convey rich information about emotional states, while pupil dilation and constriction also provide important cues for depression recognition. Therefore, grounded in self-report assessment scales and professional diagnosis by doctors, this thesis used images of different stimulus types to induce emotional changes in subjects, obtaining facial expression and pupil data for depressed subjects and healthy controls. Building on the analysis of the facial expression and pupil data, the thesis further explores multimodal fusion methods based on these two modalities. The main work and innovations are as follows:

(1) To construct an effective depression recognition model, data collection and processing were carried out. Through subject recruitment, selection of stimulus-paradigm pictures, and study of the eye-tracker equipment, video data were collected for subsequent model construction and training. The data were induced under positive, neutral, and negative stimulus paradigms, and the facial expressions and pupil changes of the subjects were recorded. This dataset is used to develop and test a depression recognition algorithm based on multimodal fusion, in the hope of aiding the auxiliary diagnosis of depression.

(2) A depression recognition algorithm based on the frequency domain is proposed. The algorithm detects facial expression by estimating pixel-level changes in the frequency domain, extracts frequency-domain expression features with a sliding-window Fourier transform, and then applies a classification algorithm to identify depression. Experimental results show that the classification accuracy of the proposed method is significantly improved. The analysis also found that the expression amplitude of depressed subjects was generally higher than that of the healthy control group; further comparison showed that, within the same time period, the expression amplitude of depressed subjects exhibited a pronounced peak exceeding that of other time periods, while the expression amplitude of the healthy controls fluctuated within a stable interval. This suggests that depressed subjects and healthy controls attend to information differently under the stimulus paradigms. A minimal sketch of the sliding-window feature extraction is given below.
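The abstract does not specify how the sliding-window Fourier step is implemented; the following is a minimal sketch, assuming the facial video has first been reduced to a one-dimensional per-frame pixel-change signal. The function name, window length, hop size, and frame-differencing step are illustrative assumptions, not the thesis's actual settings.

```python
import numpy as np

def sliding_window_fft_features(signal, win_len=64, hop=16):
    """Extract frequency-domain features from a 1-D pixel-change signal
    with a sliding-window Fourier transform (illustrative parameters)."""
    features = []
    for start in range(0, len(signal) - win_len + 1, hop):
        window = signal[start:start + win_len] * np.hanning(win_len)
        spectrum = np.abs(np.fft.rfft(window))               # magnitude spectrum
        features.append(spectrum / (np.sum(spectrum) + 1e-8))  # normalize per window
    return np.asarray(features)  # shape: (num_windows, win_len // 2 + 1)

# Toy usage: mean inter-frame absolute pixel difference as the change signal.
# frames: (T, H, W) grayscale video array, hypothetical stand-in for real data.
frames = np.random.rand(300, 64, 64)
frame_diff_signal = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))
feats = sliding_window_fft_features(frame_diff_signal)
print(feats.shape)
```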
(3) Most existing methods focus on proposing different cross-modal fusion strategies. However, these strategies often fail to fully consider the complementary properties between modalities, and the redundancy introduced among the features of different modalities cannot guarantee that the original semantic information is preserved during intra-modal interaction. To address this, this thesis proposes a cross-modal fusion network based on self-attention (CMF-SN) for depression diagnosis, which combines facial expression data and pupil data and extracts information within and between modalities to construct an effective fusion model. First, the expression and pupil modalities are encoded separately: a residual network extracts the spatiotemporal structural features of the expression sequence, and a one-dimensional CNN extracts the pupil features. Second, the two modalities are fed into the cross-modal fusion module, where a self-attention mechanism performs feature selection within each modality so that the selected features can interact effectively, while a self-attention network also performs feature selection between the two modalities. Finally, the resulting features are passed through fully connected layers for depression recognition. Ablation results show that the proposed cross-modal module fully considers the complementary information of the different modalities and makes full use of intra- and inter-modal interaction to achieve feature transfer; the self-attention mechanism and residual structure help ensure the effectiveness and integrity of the information interaction. (A simplified sketch of this module appears after the summary below.)

(4) The thesis further explores and verifies decision-level multimodal fusion of facial expression and pupil data. First, the algorithms proposed in the preceding chapters are used to extract facial expression and pupil features in both the temporal and frequency domains. Then, several commonly used decision-level fusion algorithms are applied to fuse the two modalities and validate depression classification performance. Experimental results show that the decision-level multimodal fusion model reaches a classification accuracy of 77.8%, higher than single-modality classification, demonstrating the effectiveness of fusing facial expression and pupil information for depression recognition. (A sketch of one such fusion rule also follows the summary.)

In summary, this thesis carried out a series of explorations of depression recognition based on facial expression and pupil data. Facial expression and pupil features of depressed subjects and healthy controls were extracted, and depression recognition was studied in both single-modality and multimodal settings. Building on the feature-level analysis, a cross-modal fusion model based on a self-attention network was proposed that accounts for within-modality and between-modality information with minimal information loss. Fusion at the decision level was also examined, and the experiments showed that multimodal fusion outperformed the single modalities, improving the accuracy of depression recognition.
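The abstract does not publish the CMF-SN architecture details; the sketch below illustrates the described pattern (intra-modal self-attention with residual connections, then attention between the two modalities, then a fully connected classifier) in PyTorch. All dimensions, head counts, and layer sizes are illustrative assumptions, and random tensors stand in for the ResNet and 1-D CNN encoders.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Sketch of a CMF-SN-style module: per-modality self-attention for
    intra-modal feature selection, then attention between modalities.
    Hyperparameters are illustrative, not the thesis settings."""
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        # Intra-modal self-attention (one per modality).
        self.face_sa = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.pupil_sa = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Inter-modal attention: each modality attends to the other.
        self.face2pupil = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.pupil2face = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(2 * d_model, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, face_feats, pupil_feats):
        # face_feats: (B, Tf, d), e.g. ResNet features of the expression
        # sequence; pupil_feats: (B, Tp, d), e.g. 1-D CNN pupil features.
        f, _ = self.face_sa(face_feats, face_feats, face_feats)
        p, _ = self.pupil_sa(pupil_feats, pupil_feats, pupil_feats)
        f = f + face_feats    # residual connections help preserve the
        p = p + pupil_feats   # original semantic information
        f2p, _ = self.face2pupil(f, p, p)  # face queries attend to pupil
        p2f, _ = self.pupil2face(p, f, f)  # pupil queries attend to face
        fused = torch.cat([f2p.mean(dim=1), p2f.mean(dim=1)], dim=-1)
        return self.classifier(fused)      # logits: healthy vs. depressed

# Toy usage with random features standing in for the real encoders.
model = CrossModalFusion()
logits = model(torch.randn(8, 30, 128), torch.randn(8, 100, 128))
print(logits.shape)  # torch.Size([8, 2])
```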
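The abstract does not name the specific decision-level fusion rules that were compared; weighted averaging of unimodal class probabilities is one commonly used rule, sketched here. The weight and the toy probabilities are illustrative assumptions, with the weight typically tuned on a validation set.

```python
import numpy as np

def weighted_score_fusion(prob_face, prob_pupil, w_face=0.6):
    """Late (decision-level) fusion: weighted average of the class
    probabilities from two unimodal classifiers, then argmax."""
    fused = w_face * prob_face + (1.0 - w_face) * prob_pupil
    return fused.argmax(axis=1)  # final class decision per subject

# Toy usage: probabilities for 5 subjects over 2 classes (healthy, depressed).
prob_face = np.array([[0.7, 0.3], [0.4, 0.6], [0.2, 0.8], [0.9, 0.1], [0.5, 0.5]])
prob_pupil = np.array([[0.6, 0.4], [0.3, 0.7], [0.4, 0.6], [0.8, 0.2], [0.3, 0.7]])
print(weighted_score_fusion(prob_face, prob_pupil))
```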
Keywords/Search Tags: Depression, Facial expression, Pupil, Self-attention network, Multi-modal fusion