
Acoustic modeling and feature selection for speech recognition

Posted on: 2006-11-05
Degree: Ph.D.
Type: Thesis
University: University of Illinois at Urbana-Champaign
Candidate: Zheng, Yanli
Full Text: PDF
GTID: 2458390005495237
Subject: Engineering
Abstract/Summary:
The standard hidden Markov model (HMM) has proved to be the most successful model for speech recognition. Its most widely discussed weakness is the assumption that observations are independent given the state sequence. In the past few years, a wide range of state-space models and graphical models, such as segmental models and switching linear dynamical systems, have been applied to the speech recognition task. An underlying difficulty of these systems is the tradeoff between computational complexity and representational capability. This thesis presents results and explorations indicating that recognition performance can be improved by incorporating acoustic-phonetic prior information into a nonlinear state-space model and by incorporating more discriminative measurements into a combined support vector machine (SVM) and HMM system.

The investigation is divided into three parts. In the first part, a nonlinear dynamic system is proposed for formant tracking. In contrast to previous formant trackers, which depend on least squares estimation of LPC coefficients, MUSIC (Multiple Signal Classification) and ESPRIT (Estimation of Signal Parameters via Rotational Invariance Techniques) are used to improve the accuracy of formant estimation. Furthermore, a mixture of nonlinear dynamic systems is developed to improve formant tracking performance. In the second part, the formant tracker is extended to perform phoneme recognition. The results indicate that the system's inability to estimate its measurement error prevents it from performing well on phoneme recognition tasks. In the third part, a combined SVM and HMM system is used to show that formant information is indeed useful for distinguishing phonemes. The results of this part also suggest that the output of the SVM can be treated as a particular case of a discriminant transformation of the original acoustic space and might be useful for speech recognition.
Keywords/Search Tags: Speech recognition, Acoustic, Model, HMM, Part