
Research On Classification Of The Pathological Voice Based On Deep Neural Networks

Posted on: 2018-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: S M Xie
Full Text: PDF
GTID: 2428330596953017
Subject: Information and Communication Engineering
Abstract/Summary:
As the pace of life accelerates, voice disorders caused by excessive or improper voice use and by bad habits are becoming more common. Pathological voice diagnosis that combines speech signal analysis with pattern recognition is objective and non-invasive when detecting the nature and severity of a pathology, so the classification and severity grading of disordered voices has become a current research focus. Based on a large-scale Chinese pathological voice database, and combining voice signal processing with deep learning, this thesis carries out feature extraction from pathological voice and classifier modeling in order to develop a comprehensive, automatic system for grading the severity of disordered voices. The specific contents are as follows:
(1) Based on the large-scale Chinese pathological voice database, the acoustic features of pathological voice are analyzed. Acoustic features extracted from sustained vowels form the 69-dimensional Basic Acoustics Feature Set (BAFS). In addition, 44 dimensions of parameters, including Mel-frequency Cepstral Coefficients (MFCC), Smoothed Cepstral Peak Prominence, and Long-Term Average Spectrum, are extracted from continuous speech.
(2) A modulation transform is applied to sustained vowels, and a new feature based on the modulation spectrum (MS) is proposed to characterize the voice signal. In the four-class assessment system, MS represents pathology better than MFCC and BAFS and improves the system's accuracy.
(3) When continuous speech utterances are used for voice assessment, more sophisticated pattern clustering and modeling methods are needed to cope with the large variation in acoustic parameters, so a DBN-DNN is used as the model. Its results are compared with those of a baseline that uses a classic GMM classifier, and the experiments show that the four-class assessment system with the DBN-DNN classifier achieves better classification of pathological voice quality. Moreover, the 44-dimensional spectral features are combined, and the DBN-DNN performs better when its input is the combined multidimensional feature set.
(4) The thesis studies how to reduce the risk of misclassification between adjacent groups of pathological voices. Multimodal analysis of the voices of patients with dysphagia shows that kinematic features of the articulators distinguish the normal category from the mild category well. To address misclassification between the mild and moderate categories, the thesis explores the correlation between the degree of voice pathology and the ASR recognition rate. Based on an ASR system, it proposes a method for predicting the severity of a voice disorder, with which the mild, moderate, and severe categories are highly distinguishable.
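The abstract does not say how the modulation transform in item (2) is computed. As a rough illustration only, and not the thesis's method, one common way to obtain a modulation spectrum is a two-stage Fourier analysis: take a short-time magnitude spectrum, then Fourier-transform each acoustic-frequency band's envelope along time. The function names, the Hann window, and the `frame_len` and `hop` values below are all assumptions chosen for the sketch.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the non-negative-frequency DFT bins of a real sequence."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def modulation_spectrum(signal, frame_len=64, hop=32):
    """Hypothetical two-stage modulation spectrum (illustrative, not from the thesis).

    Returns a list indexed as [acoustic_bin][modulation_bin].
    """
    # Stage 1: Hann-windowed short-time magnitude spectrum, one row per frame.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))

    # Stage 2: for each acoustic bin, transform its mean-removed envelope
    # over time to expose how fast the energy in that band fluctuates.
    result = []
    for b in range(frame_len // 2 + 1):
        envelope = [f[b] for f in frames]
        mean = sum(envelope) / len(envelope)
        result.append(dft_magnitudes([e - mean for e in envelope]))
    return result
```

With these assumed settings and an 8 kHz sampling rate, each acoustic bin spans 125 Hz and the envelope is sampled at 250 frames per second, so a 1 kHz tone whose amplitude is modulated at 25 Hz should show its strongest modulation component in the row for acoustic bin 8. A practical system would likely use an FFT library rather than this direct DFT, which is kept here only so the sketch has no dependencies.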
Keywords/Search Tags: pathological voice, pattern recognition, deep neural networks, modulation transform