The Research Of Extracting Of Pathological Voice's Characteristics And Recognition Based On Wavelet Transformation And Gaussian Mixture Model

Posted on: 2009-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Yu
Full Text: PDF
GTID: 2178360245959614
Subject: Circuits and Systems
Abstract/Summary:
Pathological voice recognition applies computer technology to the field of medicine. In clinical use it can contribute substantially to painless, non-invasive voice examination and to objective diagnosis. Because the voice is complex, a diagnosis cannot rest on only one or a few acoustic parameters, so in practice it has not been possible to dispense with the doctor's subjective, experience-based judgment. To achieve truly objective examination, many researchers have done a great deal of work and have contributed significantly to the intelligent recognition and objective evaluation of pathological voice, but shortcomings remain for clinical practice.
Building on this earlier work, the thesis studies a pathological voice recognition system based on the wavelet transform and the Gaussian mixture model (GMM). Considering the mechanism of voice production and the different behaviour of abnormal and normal voices in the frequency domain, the thesis proposes a new feature-extraction method, the Entropy Coefficient Based on De-noise, Decomposition of Multi-scale Analysis (ECDDMA), which uses wavelet decomposition to find the characteristics of pathological voice. Normal and abnormal voices are then recognized with a GMM.
The database used in the thesis contains 242 normal voice samples and 234 abnormal voice samples, with all of the pathological samples collected clinically. From each class, 80 samples are selected at random for training, and the remaining samples (162 normal and 154 abnormal) are used for testing. The basic theory of the wavelet transform and of wavelet de-noising is introduced in detail, and the ECDDMA extraction process and algorithm are derived.
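As an illustration only, the kind of feature described here (wavelet decomposition, de-noising, then one entropy value per sub-band) can be sketched in Python. The abstract does not state which wavelet, threshold rule, or entropy definition ECDDMA uses, so the Haar wavelet, the fixed soft threshold `thr`, the Shannon entropy of normalised coefficient energies, and all function names below are assumptions and not the thesis's algorithm.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr (assumed de-noising rule)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def band_entropy(coeffs):
    """Shannon entropy of the normalised energy distribution within one band."""
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0.0  # an empty or all-zero band carries no information
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0.0)


def entropy_features(frame, levels=3, thr=0.05):
    """Decompose a frame, de-noise the detail bands, return one entropy per band."""
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(band_entropy(soft_threshold(detail, thr)))
    features.append(band_entropy(approx))  # coarsest approximation band
    return features


# Hypothetical 64-sample frame: a slow sinusoid plus a faster, weaker component.
frame = [math.sin(0.3 * n) + 0.1 * math.sin(2.5 * n) for n in range(64)]
features = entropy_features(frame, levels=3)
print(len(features), features)
```

With `levels=3` the sketch yields four values per frame (three detail bands and the final approximation band). A vector of this kind is what a classifier such as a GMM would be trained on, one model per class.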
The experimental results indicate that the ECDDMA parameters are better suited to distinguishing normal from abnormal voices than the traditional MFCC and its dynamic features, which mimic the non-linear frequency response of the human ear, and that they yield good recognition results. The thesis also analyzes the necessity of de-noising during feature extraction, the influence of the number of GMM mixture components, and the influence of the number of wavelet decomposition levels.
Because ECDDMA is extracted over the whole frequency range, some features do not improve the recognition rate; instead they degrade recognition performance and make the computation more complex. Feature selection is therefore necessary: building the model from only the effective features improves recognition performance. The thesis compares the traditional exhaustive search with feature selection based on a neural network. The experiments indicate that exhaustive feature selection is not practical and demonstrate the superiority of the neural-network method. Finally, a 7-dimensional feature vector is selected from the original 22-dimensional features (ECDDMA plus energy feature coefficients) using the neural-network method, and it achieves better recognition performance.
A comparative analysis of the recognition performance of acoustic parameters and ECDDMA coefficients, trying different combinations of the feature selection results, shows that ECDDMA coefficients are superior to acoustic parameters for computer-based pathological voice recognition.
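To give a sense of why exhaustive feature selection scales poorly, the short Python sketch below counts the candidate subsets of a 22-dimensional feature set. The figures 22 and 7 come from the abstract; the variable names are illustrative, and the abstract does not say exactly which subsets the thesis's exhaustive search enumerated, so both counts are shown.

```python
import math

N_FEATURES = 22  # ECDDMA + energy feature coefficients, as stated in the abstract
SUBSET_SIZE = 7  # dimension of the finally selected feature vector

# Every non-empty subset of the 22 features.
all_subsets = 2 ** N_FEATURES - 1

# Subsets of exactly 7 features, if the target size were known in advance.
fixed_size_subsets = math.comb(N_FEATURES, SUBSET_SIZE)

print(all_subsets)         # 4194303
print(fixed_size_subsets)  # 170544
```

Each candidate subset would need its own model training and evaluation, so even the smaller count implies a very large number of training runs.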
Keywords/Search Tags: Pathological Voice, wavelet transformation, de-noising, Mel Frequency Cepstrum Coefficient (MFCC), ECDDMA Coefficient, Gaussian Mixture Model (GMM)