
Depression Recognition Based on Multi-modality

Posted on: 2022-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Hao
Full Text: PDF
GTID: 2504306611994719
Subject: Computer Software and Computer Applications
Abstract/Summary:
In recent years, suicides caused by depression have been frequently reported in the media, and "early detection, early treatment" is considered the best plan for managing the disease, which underscores the necessity of early screening for depression. Traditional diagnosis of depression relies mostly on self-report scales and clinical interviews, but this approach is limited by the professional level of clinicians and the currently uneven distribution of medical resources in China. The rapid development of artificial intelligence provides a new solution for depression recognition. Depressed patients tend to be negative and world-weary in their language, to frown more and smile less in their expressions, and to speak slowly with frequent pauses; assisting the identification of depression through sentiment analysis is therefore a growing trend. Some studies have already detected depression by analyzing social-media text, speech signals, or facial images, an approach with the advantages of convenient collection, inexpensive equipment, and non-invasiveness. However, because depression manifests in many different ways, recognition based on a single feature cannot obtain sufficient information, resulting in inaccurate recognition. This paper therefore designs a multimodal depression recognition system based on text, speech signals, and facial images. The specific work is as follows:

(1) Text sentiment analysis. The paper introduces prompt learning, the "fourth paradigm" of natural language processing, to perform sentiment analysis on the response texts collected during user interviews. Unlike the common practice of fine-tuning a pre-trained model to fit a downstream task, prompt learning reconstructs the downstream task's data to fit the pre-trained model: with the help of textual prompts, the task is solved in the same form used during pre-training, which reuses the pre-trained model's capabilities, saves training time, and reduces overhead (see the first code sketch below).

(2) Facial expression analysis. The paper uses the ResNet101V2 model to analyze the user's facial expressions captured during the interview. ResNet101V2 modifies ResNet101 by applying batch normalization and activation before each convolution (the pre-activation design), which has a regularizing effect, stabilizes the training of very deep networks, makes them easier to train, and improves generalization; it performs well in image-classification tasks such as facial expression recognition (see the second sketch below).

(3) Speech sentiment analysis. The paper uses a BiLSTM model to perform sentiment analysis on the user's speech signals recorded during the interview. A BiLSTM is composed of a forward LSTM and a backward LSTM; it retains long-term memory and captures emotional information in both directions, giving it clear advantages in sequence-modeling applications such as speech emotion analysis (see the third sketch below).

(4) Multimodal fusion. The paper uses an attention mechanism to fuse the multimodal features. The attention mechanism continually adjusts the weight of each modality, multiplies each weight by the corresponding feature vector, and finally concatenates the results into a fused feature vector. Attention-based fusion realizes the complementarity of multimodal information, computes each modality's contribution, and ensures that the fusion is reasonable and accurate (see the fourth sketch below).

The system can serve as an auxiliary tool for early depression screening: it is efficient, convenient, and unconstrained by time and place, and it can greatly improve doctors' diagnostic efficiency.
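The following is a minimal, hypothetical sketch of the prompt-based text analysis in (1), assuming a BERT-style masked language model accessed through Hugging Face transformers. The cloze template and the verbalizer words "good"/"bad" are illustrative assumptions, not the thesis's actual prompt design.

```python
# Hypothetical sketch: reconstruct sentiment analysis as a cloze task so a
# frozen pre-trained masked language model can solve it without fine-tuning.
# Template and verbalizer words are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def prompt_sentiment(response: str) -> float:
    # Wrap the interview response in a prompt ending with a [MASK] slot.
    prompt = f"{response} Overall I felt very [MASK]."
    # Restrict predictions to the verbalizer words and compare their scores.
    candidates = fill_mask(prompt, targets=["good", "bad"])
    scores = {c["token_str"]: c["score"] for c in candidates}
    return scores["bad"] - scores["good"]  # > 0 suggests negative affect

print(prompt_sentiment("I have not enjoyed anything for weeks."))
```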
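A minimal sketch of the expression classifier in (2), assuming a TensorFlow/Keras setup with the ImageNet-pretrained ResNet101V2 backbone. The seven expression classes and the 224x224 input size are illustrative assumptions, not the thesis's configuration.

```python
# Minimal sketch: ResNet101V2 (the pre-activation variant) as the backbone
# of a facial-expression classifier. Class count and input size are assumed.
import tensorflow as tf

def build_expression_model(num_classes: int = 7) -> tf.keras.Model:
    backbone = tf.keras.applications.ResNet101V2(
        include_top=False, weights="imagenet", input_shape=(224, 224, 3)
    )
    # Pool the backbone's feature maps and attach a softmax expression head.
    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(backbone.input, outputs)

model = build_expression_model()
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```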
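A minimal PyTorch sketch of the speech component in (3): a BiLSTM over frame-level acoustic features such as MFCC sequences. The 40-dimensional features, hidden size, and binary output are illustrative assumptions.

```python
# Minimal sketch: a BiLSTM sentiment classifier over acoustic frame features.
import torch
import torch.nn as nn

class SpeechBiLSTM(nn.Module):
    def __init__(self, feat_dim: int = 40, hidden: int = 128, num_classes: int = 2):
        super().__init__()
        # bidirectional=True runs a forward and a backward LSTM and
        # concatenates their hidden states at every time step.
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, feat_dim)
        outputs, _ = self.bilstm(frames)
        # Summarize the utterance with the last time step's bidirectional state.
        return self.classifier(outputs[:, -1, :])

logits = SpeechBiLSTM()(torch.randn(4, 300, 40))  # 4 utterances, 300 frames each
```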
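A hedged PyTorch sketch of the fusion step in (4) as described above: score each modality's feature vector, turn the scores into modality weights with a softmax, scale each vector by its weight, and concatenate. The per-modality feature dimensions are illustrative assumptions.

```python
# Hedged sketch of attention-weighted modality fusion: weight each modality's
# feature vector, then cascade (concatenate) the weighted vectors.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dims=(256, 128, 128)):  # e.g. text, speech, face (assumed)
        super().__init__()
        # One scalar scoring head per modality.
        self.scorers = nn.ModuleList(nn.Linear(d, 1) for d in dims)

    def forward(self, feats):
        # feats: list of (batch, dim_i) tensors, one per modality.
        scores = torch.cat([s(f) for s, f in zip(self.scorers, feats)], dim=1)
        weights = torch.softmax(scores, dim=1)  # (batch, n_modalities)
        # Scale each modality by its weight, then concatenate into one vector.
        weighted = [f * weights[:, i : i + 1] for i, f in enumerate(feats)]
        return torch.cat(weighted, dim=1)  # (batch, sum(dims))

fused = AttentionFusion()([torch.randn(8, 256), torch.randn(8, 128), torch.randn(8, 128)])
```

The softmax ensures the modality weights are non-negative and sum to one, so each modality's contribution to the fused vector is explicit and comparable.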
Keywords/Search Tags:Depression, Text features, Facial expressions, Voice features, Multi-modality