
Research And Design Of Parkinson’s Disease Auxiliary Detection Scheme Based On Multi-type Speech Data

Posted on: 2024-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Wang
GTID: 2544307136492694
Subject: Electronic information
Abstract/Summary:
Early in its course, Parkinson’s disease presents with speech disorders such as unstable pronunciation and weakened voice quality. To analyze subjects’ speech abilities, experts have designed various types of language materials based on these physiological phenomena, including sustained vowels, repeated syllables, and contextual dialogues. Many researchers have applied machine learning techniques to recordings of subjects reading these materials and have successfully performed speech-based Parkinson’s disease detection. However, non-pathological factors such as differences in the collection environment, individual variation, and the subjects’ native language introduce confounding factors into machine learning models. This reduces the detection accuracy of the models and makes them unsuitable for cross-language scenarios. It is therefore necessary to research and design more efficient and stable disease detection models that eliminate the influence of non-pathological factors and serve a broader population.

Single-type language materials cannot comprehensively reflect a subject’s vocal condition and are susceptible to differences in collection environment and individual variation, which leads to model misjudgment. This thesis therefore first analyzes multi-source speech data obtained from multiple types of language materials. Such data can provide more comprehensive pathological information and help eliminate the influence of non-pathological factors. To make full use of this multi-source speech data, the thesis proposes a multi-source speech information fusion model for assisting in Parkinson’s disease detection. Through multiple branches, the model learns unique information and shared information from each individual source of speech data, thereby comprehensively extracting the pathological information carried by the multi-source data. To facilitate information interaction across the multi-source speech data, the model introduces a multi-head attention mechanism for more fine-grained information fusion. To further ensure that unique information and shared information are extracted separately, the model imposes orthogonal constraints on them, which improves the extraction of pathological information. Multiple comparative experiments on a self-collected dataset demonstrate that the proposed model outperforms models based on single-source speech data in accuracy, sensitivity, F1 score, and other performance indicators for Parkinson’s disease detection. Moreover, the effective integration of shared and unique information enables the proposed model to perform better than other information fusion models.

Current Parkinson’s disease speech datasets are characterized by small sample sizes and subjects drawn from a single language population. Previous studies have also found that pronunciation habits differ across language populations, so the probability distributions of speech data from different languages are inconsistent. To prevent language differences from leaving the model unable to perform adequately in cross-language scenarios, this thesis proposes a cross-language Parkinson’s disease detection model based on transfer learning. First, the model cascades self-attention encoders and a multi-layer neural network to form a feature extractor, which extracts high-level semantic representations of speech and decouples the speech features into two vectors. Then, the model incorporates a dual adversarial training module, in which the feature extractor participates in two target tasks with inconsistent objectives through the two output vectors, explicitly separating domain-invariant pathological information from domain information. Ultimately, this approach reduces the differences between multi-language speech data and extracts domain-invariant pathological information from cross-language speech data. The model is trained on a self-collected dataset and the publicly available Max Little dataset, and the experimental results demonstrate that it can achieve high detection accuracy in cross-language scenarios. Compared to traditional models, it shows improvements in accuracy, sensitivity, F1 score, and other performance indicators.
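
The abstract does not state the exact form of the orthogonal constraint placed on the shared and unique information. As a rough illustration only, one common choice is to penalize the squared cosine similarity between each sample's shared vector and its unique vector. The sketch below assumes that form. The function name `orthogonality_penalty`, the NumPy representation, and the `(batch, dim)` array layout are hypothetical and are not taken from the thesis.

```python
import numpy as np


def orthogonality_penalty(shared: np.ndarray, unique: np.ndarray,
                          eps: float = 1e-12) -> float:
    """Mean squared cosine similarity between paired shared/unique vectors.

    Both inputs have shape (batch, dim). The value is 0 when every shared
    vector is orthogonal to its unique counterpart and approaches 1 when
    the two are parallel. This is an assumed form of the constraint, not
    the one confirmed by the thesis.
    """
    # Normalize each row so the penalty depends on direction, not magnitude.
    s = shared / (np.linalg.norm(shared, axis=1, keepdims=True) + eps)
    u = unique / (np.linalg.norm(unique, axis=1, keepdims=True) + eps)
    # Row-wise dot product gives the cosine similarity per sample.
    cosine = np.sum(s * u, axis=1)
    return float(np.mean(cosine ** 2))


if __name__ == "__main__":
    shared = np.array([[1.0, 0.0], [0.0, 2.0]])
    unique = np.array([[0.0, 3.0], [1.0, 0.0]])
    print(orthogonality_penalty(shared, unique))  # orthogonal pairs
    print(orthogonality_penalty(shared, shared))  # parallel pairs
```

In a training loop, a term like this would typically be added to the classification loss with a small weight, so that the two branches are pushed toward encoding different information without overriding the detection objective.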
Keywords/Search Tags: Parkinson’s disease, speech signal processing, multi-source speech data, cross-lingual analysis, deep learning