
Research On Recognition Of Depression Based On Speech Signal

Posted on: 2024-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: N Y Liu
Full Text: PDF
GTID: 2544307100962089
Subject: Computer technology
Abstract/Summary:
Depression is a common mental illness that can leave people in a state of low mood and reduced energy for long periods. Some patients show symptoms such as self-injury and hallucinations, and severe cases can lead to suicide. Depression affects not only the quality of life and happiness of individuals but also their families, work, and society. Traditional diagnosis relies mainly on clinical interviews or self-report questionnaires. These methods have limitations: they are highly subjective and lack objective diagnostic indicators, which can lead to inaccurate diagnosis. Finding an objective auxiliary diagnostic method is therefore an important part of diagnosing depression.

As an important channel of human expression, the speech signal carries a wealth of physiological and psychological information. Clinical studies have found that patients with depression speak more slowly and in a lower tone than healthy people, and that they hesitate and pause more when speaking. In recent years, advances in speech recognition technology have made it possible to collect speech signals and extract these speech features, while advances in artificial intelligence have made it possible to analyze and understand the contextual information in speech. Together, these developments make it feasible to use speech recognition technology to identify and help diagnose depression.

This thesis studies depression recognition based on speech signals. The main work is as follows:

(1) A corpus of speech from patients with depression was collected. A total of 157 Chinese subjects (76 cases and 81 controls) were recruited. Word-reading experiments were designed to induce rapid emotional changes in the participants. The word lists comprised positive, neutral, and negative words, and one speech recording was collected from each subject. From these recordings, a speech database for depression recognition was constructed.

(2) Correlation analysis of low-dimensional speech features. To study how speech features differ between patients with depression and healthy people when reading words of different emotional valence (positive, neutral, and negative), 384 low-dimensional features were extracted from the collected speech. Pearson and Spearman correlation analyses were then carried out between these features and the subjects' Hamilton Depression Scale (HAMD) scores. The results showed that differences in the severity of depression were mainly reflected in three features: Mel-frequency cepstral coefficients (MFCC), voicing probability (voiceProb), and the zero-crossing rate of the time signal.

(3) A depression recognition model based on speech features. First, the data are preprocessed, including speech completion and data normalization. Second, an attention-based global-perception gating architecture is constructed for speech-based depression recognition. A convolutional neural network first extracts deep spectral features from the signal; multiple parallel gMLP gating modules (called Multi-mlp in this thesis) then connect and integrate the local speech features. Each gating module contains a Global Connection Unit (GCU). The dot-multiplication operation in this unit strengthens feature fusion across the channel dimension and improves the exchange of information between local speech features. Finally, an attention layer extracts the globally important emotional information. Compared with traditional speech-based methods for depression recognition, the proposed model obtained the highest accuracy and F1 score, which verifies its effectiveness.
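The correlation analysis in item (2) can be illustrated with a minimal sketch. It is not the thesis's code: the function names (`pearson`, `average_ranks`, `spearman`) are hypothetical, and the HAMD scores and feature values are invented toy numbers used only to show the computation. Pearson measures linear association between a feature and the HAMD score. Spearman applies the same formula to the ranks of the values, so it captures monotonic association.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy, made-up values for illustration only (not data from the thesis):
# one HAMD score and one hypothetical speech-feature value per subject.
hamd_scores = [3, 8, 15, 22, 27]
feature_values = [0.9, 0.7, 0.75, 0.4, 0.1]

print("Pearson: ", pearson(hamd_scores, feature_values))
print("Spearman:", spearman(hamd_scores, feature_values))
```

In a study of this kind, the two coefficients would be computed for each of the 384 features against the HAMD scores, and the features with the strongest correlations would be examined further. In practice a statistics library would normally be used, since it also reports significance levels.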
Keywords/Search Tags: depression detection, speech emotion recognition, deep learning, speech features