Research Of Key Technologies For Pronunciation Quality Evaluation Of Patients With Speech Disorders

Posted on: 2023-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: P Yang
GTID: 2544306830986179
Subject: Information and Communication Engineering
Abstract/Summary:
In daily life, speech disorders seriously affect people’s communication, productivity, and quality of life. At present, diagnosing dysphonia and evaluating a patient’s pronunciation quality depend mostly on the auditory analysis ability and experience of the doctor. This brings a series of problems, such as high cost, low practicability, and strong subjectivity. Using computer and signal processing technology to analyze pathological speech can make early diagnosis of speech disorders and speech rehabilitation training more convenient and effective in clinical practice, and therefore has important research significance. This thesis focuses on the key technical issues of speech quality evaluation for patients with speech disorders, and proposes solutions to the problems of feature extraction, detection, and quality evaluation of pathological speech. The main contributions of this thesis include:

(1) To compare pathological and normal vowels, their basic acoustic characteristics, nonlinear chaotic characteristics, and spectral characteristics are analyzed in order to investigate the pronunciation differences between healthy speakers and patients. Pathological and normal speech have different energy distributions across frequency bands. Based on these differences, an improved sub-band spectral centroid (SSC) feature is proposed for pathological speech detection. For the pathological vowels /a/, /i/ and /u/ on the SVD dataset, a support vector machine (SVM) model using SSC features achieves correct detection rates of 68.89%, 67.39% and 67.34%, respectively. According to the experimental results, the SSC features help the model achieve better detection performance than other acoustic features.

(2) A feature fusion method based on the radar map is proposed. The radar map is used to represent the acoustic features of speech, the acoustic features are then fused by extracting the barycenter feature of the radar map, and finally a genetic algorithm is used to select among the fused radar map features. For the pathological vowels /a/, /i/ and /u/ on the SVD dataset, after the fusion of acoustic features, the correct detection rates of the SVM model are 77.19%, 76.09% and 76.05%, respectively. On average, the correct detection rate is 7% higher than that obtained by directly combining the features.

(3) An objective evaluation model of pathological speech pronunciation quality based on the subjective evaluation mode is proposed. The traditional model is divided into three parts: one extracts pronunciation quality information from the pathological speech, one extracts content information, and the last performs information fusion. In this way, the sample information extracted by the objective evaluation model is more accurate. Between the subjective and objective scores of the test set samples, the mean absolute error (MAE) and root mean square error (RMSE) are relatively reduced by 10.7% and 9.4%, respectively, while the Pearson correlation coefficient (PCC) is relatively increased by 16.4%, compared with the traditional model.

(4) A weighting method for the mean square error loss based on the sigmoid function is proposed. During training, the sigmoid function maps the prediction error of each sample in a batch to a weight on that sample's loss, so that the model focuses on hard-to-train samples. Experiments show that, compared with the unweighted results, loss weighting relatively reduces the MAE and RMSE by 9.3% and 8.1%, respectively, and relatively increases the PCC by 4.6%.
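Contribution (1) builds on the sub-band spectral centroid. The abstract does not say how the thesis improves the feature, so the following is only a minimal sketch of the standard, unimproved form: the power-weighted mean frequency inside each band. The function name, the band edges, and the midpoint fallback for empty bands are assumptions made for illustration.

```python
def subband_spectral_centroids(freqs, power, band_edges):
    """Return one spectral centroid (in Hz) per sub-band.

    freqs      -- bin centre frequencies of one frame's spectrum
    power      -- power (or magnitude) at each bin, same length as freqs
    band_edges -- ascending edges; band i covers [edges[i], edges[i+1])
    """
    if len(freqs) != len(power):
        raise ValueError("freqs and power must have the same length")
    centroids = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        weighted_sum = 0.0
        total_power = 0.0
        for f, p in zip(freqs, power):
            if lo <= f < hi:
                weighted_sum += f * p
                total_power += p
        if total_power > 0:
            centroids.append(weighted_sum / total_power)
        else:
            # Assumption: an empty or silent band falls back to its midpoint.
            centroids.append(0.5 * (lo + hi))
    return centroids


# Toy spectrum: energy at 100 and 200 Hz in the low band, at 600 Hz in the high band.
freqs = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0]
power = [0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0]
ssc = subband_spectral_centroids(freqs, power, [0.0, 400.0, 800.0])
```

Each frame yields one centroid per band, so a shift of energy within a band moves that band's centroid, which is the kind of energy-distribution difference the abstract refers to.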
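The radar map fusion in contribution (2) can be pictured as follows. The abstract does not define the barycenter feature, so this sketch makes its own assumptions: features are scaled to non-negative values, placed on equally spaced axes, and the barycenter is taken as the area centroid of the resulting polygon (shoelace formula). The function name `radar_barycenter` is hypothetical, and the genetic-algorithm selection step is not shown.

```python
import math


def radar_barycenter(features):
    """Return (area, cx, cy) of the radar-map polygon of the given features.

    Assumption: feature k is drawn at angle 2*pi*k/n with radius features[k],
    and all features are non-negative.
    """
    n = len(features)
    if n < 3:
        raise ValueError("a radar map needs at least three features")
    points = [
        (r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
        for k, r in enumerate(features)
    ]
    twice_area = 0.0
    cx_sum = 0.0
    cy_sum = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx_sum += (x0 + x1) * cross
        cy_sum += (y0 + y1) * cross
    if abs(twice_area) < 1e-12:
        raise ValueError("degenerate radar map (zero area)")
    # Polygon centroid: C = sum((p_i + p_{i+1}) * cross_i) / (6 * area).
    return twice_area / 2.0, cx_sum / (3.0 * twice_area), cy_sum / (3.0 * twice_area)


# Four features; the first one is larger, so the barycenter shifts along its axis.
area, cx, cy = radar_barycenter([2.0, 1.0, 1.0, 1.0])
```

Under these assumptions, the triple (area, cx, cy) compresses several acoustic features into a few fused values that a classifier such as an SVM could consume.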
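The three metrics reported in contributions (3) and (4) have standard definitions, sketched below together with the relative change used to state the improvements. The helper names and the sample scores are illustrative and do not come from the thesis.

```python
import math


def mae(subjective, objective):
    """Mean absolute error between two equally long score lists."""
    return sum(abs(s - o) for s, o in zip(subjective, objective)) / len(subjective)


def rmse(subjective, objective):
    """Root mean square error between two equally long score lists."""
    return math.sqrt(
        sum((s - o) ** 2 for s, o in zip(subjective, objective)) / len(subjective)
    )


def pcc(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def relative_reduction(baseline, improved):
    """Relative reduction, e.g. 0.107 corresponds to 'reduced by 10.7%'."""
    return (baseline - improved) / baseline


# Made-up scores, for illustration only.
subjective_scores = [1.0, 2.0, 3.0, 4.0]
objective_scores = [1.5, 2.5, 2.5, 4.5]
errors = (mae(subjective_scores, objective_scores),
          rmse(subjective_scores, objective_scores))
correlation = pcc(subjective_scores, objective_scores)
```

A relative change compares against the baseline's own value, so a relative MAE reduction of 10.7% means the new MAE is 89.3% of the traditional model's MAE.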
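The loss weighting in contribution (4) can be sketched as follows. The abstract only states that a sigmoid maps each sample's prediction error to a loss weight. Centring the absolute error on the batch mean and the `scale` parameter are assumptions of this sketch, not details taken from the thesis.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_weighted_mse(predictions, targets, scale=1.0):
    """Return (loss, weights) for one batch.

    Assumption: the weight of sample i is sigmoid(scale * (|e_i| - mean|e|)),
    so samples with above-average error get a weight above 0.5.
    """
    if not predictions or len(predictions) != len(targets):
        raise ValueError("need two non-empty lists of equal length")
    errors = [p - t for p, t in zip(predictions, targets)]
    abs_errors = [abs(e) for e in errors]
    mean_abs = sum(abs_errors) / len(abs_errors)
    weights = [sigmoid(scale * (a - mean_abs)) for a in abs_errors]
    loss = sum(w * e * e for w, e in zip(weights, errors)) / len(errors)
    return loss, weights


# One badly predicted sample and one perfectly predicted sample.
loss, weights = sigmoid_weighted_mse([3.0, 1.0], [1.0, 1.0])
```

In this sketch the badly predicted sample receives the larger weight, so it dominates the batch loss more than it would under plain MSE. In an automatic differentiation framework, one would also need to decide whether gradients flow through the weights. The abstract does not say which choice the thesis makes.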
Keywords/Search Tags: Speech disorder, Feature extraction, Feature fusion, Pathological voice detection, Evaluation of pronunciation quality of pathological speech