Research On Depression Tendency Recognition Based On Speech Signal

Posted on: 2022-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Z An
Full Text: PDF
GTID: 2518306500956449
Subject: Master of Engineering
Abstract/Summary:
Depression is a common mental disorder whose main characteristics are low mood and loss of interest. Because of its high prevalence and recurrence rate, it has attracted widespread public concern. At present, depression is detected mainly through diagnosis by experienced doctors and through depression scales. These methods are subjective and relatively inaccurate. In addition, prejudice against depression keeps both the treatment rate and the treatment effect relatively poor. It is therefore particularly important to find an objective, effective, and widely applicable method for detecting depression.

Because speech is non-intrusive, low-cost, and easy to obtain, depression recognition modeling based on speech signals has gradually become a research hotspot in academia. A large number of studies have found that, compared with the normal population, patients with depression show lower pitch, slower speech rate, monotonous intonation, and longer pauses, and these acoustic characteristics can serve as objective indicators for depression detection. Depressive tendency belongs to the early stage of depression, and individuals with depressive tendency account for a larger proportion of society than the depression group, so early intervention and emotion regulation can reduce the prevalence of depression.

Based on this, this thesis first establishes a corpus of depressive tendency from the perspective of psychology, uses a convolutional neural network to identify individuals with depressive tendency, and uses data augmentation and other methods to expand the data and address the small sample size. The main research contents and innovations of this thesis are as follows:

1. Collecting a corpus of depressive tendency. From the perspective of psychology, and using classic experimental paradigms of psychology, three speech tasks are designed: text reading (word reading and short text reading), interviews, and picture descriptions. Each type of speech includes three kinds of emotional stimuli: positive, neutral, and negative (short text reading uses neutral stimuli only). The speech of 50 college students was recorded, with 10 recordings per participant and 500 recordings in total, to construct a corpus of high depressive tendency.

2. Recognizing depressive tendency from the speech signal. In this thesis, speech is transformed into a spectrogram, and the spectrogram is used as the input of a convolutional neural network to recognize depressive tendency. On this basis, the thesis mainly studies the influence of different speech styles and different emotional states on classification. For each speech style, experiments were run on all speakers together (the overall experiment) and on male and female speakers separately. The results show that recognition on spontaneous speech is better than on read speech, that picture description gives the best recognition in the overall experiment, and that the interview gives the best recognition in the male and female experiments. The same three experiments were run for each emotional state. The results show that negative emotion gives the best recognition in the overall and female experiments, while neutral emotion gives the best recognition in the male experiment. Across all speech styles and emotional states, recognition performance for female speakers is significantly higher than for male speakers.

3. Expanding the corpus of depressive tendency. To address the small amount of experimental data, this thesis uses two data augmentation methods: (1) three negative-emotion corpora (sadness, anxiety, and fear), which are closest to the speech of depressive tendency, are added to the corpus of high depressive tendency; (2) image data augmentation is applied to the spectrograms of speech with high depressive tendency, so that better classification can be achieved on a small sample data set. The experimental results show that adding the negative-emotion corpora to the depressive tendency corpus did not improve the recognition rate, but the recognition result increased by 7.63% after image augmentation.
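
The abstract describes two processing steps, turning speech into a spectrogram for a convolutional neural network and applying image data augmentation to that spectrogram, without giving parameters. The following is a minimal illustrative sketch of both steps, not the thesis's implementation. The function names, the frame length and hop size, the Hann window, and the choice of a random time shift plus a frequency mask as augmentation are all assumptions made for illustration.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a short-time Fourier transform.

    frame_len and hop are illustrative defaults, not values from the thesis.
    Returns an array of shape (frame_len // 2 + 1, n_frames).
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range and keeps values non-negative.
    return np.log1p(magnitude).T


def augment(spec, rng, max_shift=8, max_mask=6):
    """Hypothetical image-style augmentation: circular time shift plus one
    zeroed band of frequency rows. The thesis does not name its operations."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.roll(spec, shift, axis=1)  # np.roll returns a new array
    width = int(rng.integers(1, max_mask + 1))
    start = int(rng.integers(0, out.shape[0] - width + 1))
    out[start : start + width, :] = 0.0
    return out


# Usage on a synthetic 1 kHz tone sampled at 8 kHz (a stand-in for real speech).
t = np.arange(8000) / 8000.0
spec = log_spectrogram(np.sin(2 * np.pi * 1000.0 * t))
augmented = augment(spec, np.random.default_rng(0))
print(spec.shape, augmented.shape)
```

Each augmented copy keeps the shape of the original spectrogram, so it can be fed to the same network input; generating several copies per recording is one way to enlarge a 500-recording corpus.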
Keywords/Search Tags: depressive tendency, speech, psychology paradigm, spectrogram, depression recognition, data augmentation