
Research And Application Of Articulation Movement Features Of Disorder

Posted on: 2020-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: M M Yan
Full Text: PDF
GTID: 2404330596985788
Subject: Information and Communication Engineering
Abstract/Summary:
Dysarthria caused by cerebral palsy and other diseases seriously impairs patients' ability to express themselves in speech, with adverse effects on their daily lives. With the development of signal processing technology, speech signal processing methods have become widely used in pathological speech research. The movement of the articulatory organs directly affects the accuracy and fluency of speech, yet there are currently few studies of articulation movement features; existing research mainly extracts acoustic features of pathological speech for recognition and classification.

A high-quality pathological speech database is the basis of such experiments. Because recording depends on the willing participation of patients with dysarthria and is costly in time and labor, such databases are scarce. Moreover, the amount of patient data in the recorded databases is very small, so the amount of data differs greatly between patients with dysarthria and normal speakers and the classes are imbalanced. For these reasons, this paper uses the openly available TORGO database to compare and analyze the differences in articulation between patients with dysarthria and normal speakers, to study the problems in the pronunciation of patients with dysarthria, and to extract articulatory features. This provides a theoretical basis for the identification and diagnosis of dysarthria. In addition, articulatory features and acoustic features are combined for recognition and classification, which further improves recognition performance.

The contents of this paper are as follows:

Using the TORGO database, lognormal and normal distributions are first fitted to the data and, with 2.58 standard deviations as the threshold, the proportions of speech movement falling in the abnormal range at the tongue root, tongue middle and tongue tip are compared between patients with dysarthria and normal speakers. The distribution of the abnormal range of speech movement at the lips is then analyzed. The study shows that the proportion of abnormal movement areas of the tongue and lips is larger in patients with dysarthria than in normal speakers, and that the difference is obvious in the left-right direction of tongue movement. The articulation movement interval of the tongue is then used as the input feature for recognition and classification. The recognition accuracy obtained with the movement interval in the left-right direction is higher than that obtained with the front-back or up-down direction, and accuracy is highest when the movement intervals of all three directions are used.

Building on the study of the distribution of articulation movement intervals, the starting time of pronunciation and the movement velocity of the articulatory organs are extracted and their distributions are analyzed statistically. The starting times of patients with dysarthria are scattered, whereas those of normal speakers follow a normal distribution. The distribution of articulation movement velocity also differs between the two groups. Finally, on the basis of these distribution studies, articulatory features consisting of articulation movement interval, articulation movement velocity and articulation starting time are extracted.

After the articulatory features are extracted, acoustic features (prosodic features and MFCC features) are extracted from the audio files corresponding to the articulation files, and the articulatory and acoustic features are combined to form a fusion feature. Comparative experiments are carried out with a support vector machine and a random forest. The results demonstrate the validity of the articulatory features and the fusion feature for classifying patients with dysarthria and normal speakers. Under the same recognition model, the recognition rate of the fusion feature is higher than that of the acoustic features alone or the articulatory features alone, and the combination of the fusion feature with the random forest achieves the highest recognition accuracy, 96.66%.

To address the imbalanced data set in pathological speech research, a cost-sensitive learning approach is adopted: a cost-sensitive random forest is used as the recognition model to improve the correct classification rate for articulation disorders with the acoustic-only, articulatory-only and fusion features. The experimental results show that the cost-sensitive random forest improves the overall recognition accuracy for all three types of features. The recognition accuracy of minority classes is also greatly improved, with an increase of 19.48%.
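
The abnormal-range comparison described in the abstract (fit a distribution, then count the share of movement values beyond 2.58 standard deviations) could be sketched roughly as below. This is only an illustration under stated assumptions, not the thesis's implementation: the function name `abnormal_proportion`, the choice to fit the distribution on the normal-speaker group, the simple mean/standard-deviation fit, and the sample values are all hypothetical. For a normal distribution, about 99% of values lie within 2.58 standard deviations of the mean.

```python
import math
import statistics

Z_THRESHOLD = 2.58  # threshold stated in the abstract


def abnormal_proportion(samples, reference, lognormal=False):
    """Fraction of `samples` lying more than Z_THRESHOLD standard deviations
    from the mean of a distribution fitted to `reference`.

    Assumption: the fit is a plain moment fit (mean and standard deviation).
    For the lognormal case the fit is done on log-transformed values, so
    every value must be positive.
    """
    if lognormal:
        samples = [math.log(x) for x in samples]
        reference = [math.log(x) for x in reference]
    mu = statistics.fmean(reference)
    sigma = statistics.pstdev(reference)
    lower = mu - Z_THRESHOLD * sigma
    upper = mu + Z_THRESHOLD * sigma
    outside = sum(1 for x in samples if x < lower or x > upper)
    return outside / len(samples)


# Hypothetical tongue-tip displacement values (arbitrary units), NOT TORGO data.
control = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98]
patient = [1.0, 1.6, 0.4, 1.1, 0.9, 1.9, 1.0, 0.2]

control_rate = abnormal_proportion(control, control)
patient_rate = abnormal_proportion(patient, control)
patient_rate_lognormal = abnormal_proportion(patient, control, lognormal=True)
```

With these made-up values, the patient group has a larger share of movements in the abnormal range than the control group, which is the kind of contrast the abstract reports. In practice the comparison would be run separately for each sensor position (tongue root, tongue middle, tongue tip, lips) and each movement direction.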
Keywords/Search Tags: pathological speech recognition, statistical distribution, articulatory features, fusion features, cost-sensitive