
Research On The Detection Of Parkinson’s Disease Based On Self-Supervised Speech Feature Extraction

Posted on: 2024-03-13 | Degree: Master | Type: Thesis
Country: China | Candidate: M Q Yang | Full Text: PDF
GTID: 2544307136993039 | Subject: Electronic information
Abstract/Summary:
Parkinson’s disease is a chronic degenerative disease of the nervous system that, although not fatal, severely affects patients’ ability to work and their quality of life. Currently, the primary method of assessing Parkinson’s disease is clinical diagnosis, which demands considerable time and effort from both doctors and patients, and its results are easily affected by fluctuations in the patient’s symptoms and by the physician’s subjectivity. In contrast, speech-based assessment is non-invasive, low-cost, and relies on data that are easy to collect. Traditional speech-based methods typically compute parameters of the patient’s speech features (e.g., jitter and shimmer) and combine them with machine learning models to detect and assess the disease. However, these features may not fully reflect all pathological phenomena, which limits the accuracy of detection and assessment. In addition, supervised learning methods often require large amounts of labeled data for training. Because of practical difficulties such as uneven acquisition cycles, suboptimal acquisition environments, and time-consuming, laborious labeling, the relatively small amount of speech data available from Parkinson’s disease patients often leads to insufficient training data. To address these issues, this thesis focuses on self-supervised speech representation learning, using large-scale unlabeled speech data to learn the underlying structural representation of Parkinson’s disease speech and thereby detect the disease. The main work of the thesis is as follows.

First, to better extract pathological information from the speech of Parkinson’s disease patients using only a small amount of labeled data, and to improve the accuracy of evaluation and detection, the thesis examines how a self-supervised model can be applied to speech-based Parkinson’s disease detection. Mel spectrogram features are extracted from the original speech of Parkinson’s disease patients, yielding a global temporal representation rich in pathological features. Part of the Mel spectrogram is then masked, and a masked self-supervised model reconstructs the masked parts, thereby learning a higher-level representation of the patients’ speech features. To address the scarcity of Parkinson’s disease speech data, the masked self-supervised model is first pre-trained on the public LibriSpeech dataset; then, following the idea of transfer learning, the pre-trained model is fine-tuned, with weighted summation, on Parkinson’s disease speech data, which improves the feature representation learning of the proposed model. Finally, a random forest classifier and a support vector machine classifier classify the extracted speech features to detect Parkinson’s disease. The results show that, compared with detection based on traditional Mel spectrogram features and with other classical self-supervised feature extraction methods, the proposed method significantly improves accuracy, true positive rate, and true negative rate.

Second, to take advantage of the different prior knowledge learned by different self-supervised models, the thesis proposes a Parkinson’s disease detection scheme that combines multiple self-supervised models by majority voting. First, each self-supervised model is pre-trained on a large public speech dataset in the source domain (e.g., LibriSpeech). Then, to better fit the Parkinson’s disease classification task, all layers of each self-supervised model are fine-tuned on an intermediate dataset (e.g., a public vowel dataset) that contains enough speech data and is semantically closer to the target domain. Next, with Parkinson’s disease speech detection as the target-domain task and the self-collected Parkinson’s disease speech dataset as the target-domain training data, the parameters obtained after intermediate-domain fine-tuning are used to initialize the target-domain model, which is then fine-tuned a second time. Finally, the decisions of the individual classification models are combined by majority voting to further improve detection performance. The experimental results show that, compared with a single self-supervised method, ensemble learning effectively improves detection accuracy without increasing the amount of existing data.
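
The masking-and-reconstruction objective described in the abstract can be illustrated with a minimal sketch. The abstract does not state the masking ratio, the span length, or the reconstruction loss, so `mask_ratio`, `span`, and the mean-absolute-error loss below are assumptions chosen for illustration only, and the function names are hypothetical. The spectrogram is represented as a plain list of frames so that the example needs no third-party libraries.

```python
import random


def mask_frames(spec, mask_ratio=0.15, span=3, seed=0):
    """Zero out contiguous spans of frames in a Mel spectrogram.

    spec is a list of frames, each a list of Mel-bin values.
    mask_ratio and span are illustrative assumptions, not values
    taken from the thesis. Returns (corrupted copy, masked indices).
    """
    rng = random.Random(seed)
    n_frames = len(spec)
    target = max(1, int(n_frames * mask_ratio))
    masked = set()
    while len(masked) < target:
        start = rng.randrange(n_frames)
        masked.update(range(start, min(start + span, n_frames)))
    corrupted = [
        [0.0] * len(frame) if i in masked else list(frame)
        for i, frame in enumerate(spec)
    ]
    return corrupted, sorted(masked)


def masked_l1_loss(pred, target, masked_idx):
    """Mean absolute error computed over the masked frames only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(pred[i], target[i]):
            total += abs(p - t)
            count += 1
    return total / count


# Toy spectrogram: 20 frames with 4 Mel bins each.
spec = [[float(i + j) + 1.0 for j in range(4)] for i in range(20)]
corrupted, idx = mask_frames(spec)
# A model would take `corrupted` as input and be trained so that its
# output minimizes masked_l1_loss(output, spec, idx).
```

Restricting the loss to the masked frames forces the model to infer the hidden content from the surrounding context, which is the general idea behind masked reconstruction pre-training.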
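
The majority-voting step that combines the decisions of the individual classification models can be sketched as follows. This is a generic hard-voting illustration, not the thesis’s implementation; the function name, the label encoding (1 for Parkinson’s disease, 0 for healthy control), and the example predictions are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions is a list with one entry per model; each entry is a
    list of predicted labels, one per sample. With an odd number of
    binary classifiers there are no ties; otherwise a tie resolves to
    the label that was encountered first among the tied ones.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    return [
        Counter(sample_votes).most_common(1)[0][0]
        for sample_votes in zip(*predictions)
    ]


# Hypothetical outputs of three fine-tuned models on four recordings
# (1 = Parkinson's disease, 0 = healthy control).
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
print(majority_vote([model_a, model_b, model_c]))  # prints [1, 0, 1, 1]
```

Hard voting of this kind only needs the final label from each model, so models with different architectures or output scales can be combined without calibration.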
Keywords/Search Tags: Parkinson’s disease, speech signal processing, self-supervised learning, feature extraction, ensemble learning