Pathological Voice Recognition Study By Wavelet Domain Multifractal And Energy Spectrum Parameters

Posted on: 2017-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Chang
GTID: 2284330488960699
Subject: Detection Technology and Automation
Abstract/Summary:
With the development of society, communication between people is becoming more and more frequent. Language is an important bridge between people, but the hoarseness and weakness of voice caused by voice diseases seriously affect people’s quality of life and social communication. Acoustic analysis of pathological voice signals enables objective evaluation of voice quality and is of considerable significance for the diagnosis and treatment of laryngeal diseases.

The wavelet transform can analyze a signal at different scales, and the multifractal spectrum can finely describe the local scaling behavior of the voice signal. The wavelet transform is also a time-frequency analysis method that effectively reflects the energy distribution of the voice in the time-scale plane. This thesis therefore studies wavelet multifractal spectrum parameters and wavelet energy spectrum parameters of the voice for pathological voice recognition.

A single fractal dimension is insufficient to describe the nonlinearity of the voice. To address this, the wavelet leaders multifractal spectrum, which combines fractal analysis with the wavelet transform, is introduced. The spectrum is calculated by the Chhabra-Jensen method, and the confidence intervals of its parameters are estimated by nonparametric bootstrap. The wavelet leaders multifractal spectrum characterizes the statistical distribution of the local singularities of the voice. The spectrum parameters of normal and pathological voices differ markedly: the spectrum width of pathological voice is smaller than that of normal voice, so normal voice shows more pronounced multifractal characteristics. Using the wavelet leaders multifractal spectrum parameters, the average recognition rate for pathological voice is 90.66%.

Traditional acoustic parameters do not represent the non-stationary characteristics of the signal well. Therefore, this thesis performs time-frequency analysis of pathological voice in the wavelet domain and proposes a wavelet energy spectrum parameter, GCWT, for pathological voice recognition. A multi-dimensional Gaussian mixture model of the energy spectrum is fitted along the scale axis, and the model parameters are used as the feature parameter GCWT. GCWT recognizes pathological voice better than traditional parameters, with an average recognition rate of 92.99%. Because GCWT is high-dimensional, principal component analysis and locally linear embedding are used to reduce its dimension. In addition, a dynamic weighted locally linear embedding (DWLLE) dimension reduction method is proposed, which weakens the effect of sparse sampling on dimension reduction, effectively preserves the geometric properties of the high-dimensional space, and reduces parameter redundancy. After dimension reduction, the recognition rate of the parameter reaches 97.45% for pathological voice.
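
As a rough illustration of the spectrum-width parameter mentioned in the abstract, the following Python sketch applies the Chhabra-Jensen direct method to a synthetic binomial cascade. It is not the implementation used in the thesis: the thesis computes the spectrum from wavelet leaders of voice signals, whereas this sketch uses plain box masses of a toy measure. The function names (binomial_cascade, coarse_grain, chhabra_jensen), the cascade weight 0.3, the range of q, and the range of scales are all assumptions chosen for illustration.

```python
import math

def binomial_cascade(p, levels):
    """Deterministic binomial measure on 2**levels dyadic cells (total mass 1)."""
    mass = [1.0]
    for _ in range(levels):
        mass = [m * w for m in mass for w in (p, 1.0 - p)]
    return mass

def coarse_grain(mass, k):
    """Sum neighbouring cells so the measure lives on 2**k equal boxes."""
    step = len(mass) // (2 ** k)
    return [sum(mass[i:i + step]) for i in range(0, len(mass), step)]

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def chhabra_jensen(mass, qs, scales):
    """Return (alpha, f) lists, one entry per q, via the direct method."""
    alphas, fs = [], []
    for q in qs:
        log_eps, a_num, f_num = [], [], []
        for k in scales:
            boxes = [b for b in coarse_grain(mass, k) if b > 0.0]
            weights = [b ** q for b in boxes]
            z = sum(weights)
            mu = [w / z for w in weights]        # normalised q-th-moment measure
            log_eps.append(-k * math.log(2.0))   # log of the box size 2**-k
            a_num.append(sum(m * math.log(b) for m, b in zip(mu, boxes)))
            f_num.append(sum(m * math.log(m) for m in mu))
        # alpha(q) and f(q) are the slopes of these sums against log box size
        alphas.append(slope(log_eps, a_num))
        fs.append(slope(log_eps, f_num))
    return alphas, fs

qs = [q / 2.0 for q in range(-10, 11)]           # q from -5 to 5 in steps of 0.5
alphas, fs = chhabra_jensen(binomial_cascade(0.3, 10), qs, scales=range(2, 9))
width = max(alphas) - min(alphas)                # spectrum width over this q range
print(f"spectrum width = {width:.3f}, peak f = {max(fs):.3f}")
```

In this setting a narrower width indicates weaker multifractality, which is the direction the abstract reports for pathological voice relative to normal voice.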
Keywords/Search Tags: pathological voice recognition, wavelet leaders multifractal, wavelet energy spectrum, Gaussian mixture model, locally linear embedding