
Two-dimensional hidden Markov models for speech recognition

Posted on: 1999-01-02
Degree: Ph.D
Type: Thesis
University: The University of Chicago
Candidate: Li, Jiayu
Full Text: PDF
GTID: 2468390014971403
Subject: Statistics
Abstract/Summary:
In this thesis, we present a new two-dimensional variation of the hidden Markov model (HMM) for speech recognition. We improve the traditional HMM in three respects.

(1) We improve the hidden state modeling of the HMM. First, for the duration of a state, given that the state is not omitted, we adopt a better parametric model than the truncated geometric distribution implied by the HMM. We then extend the state omission probability to any subset of states, providing a global view of the state process.

(2) We introduce a Markov random field (MRF) into the HMM. The MRF strengthens the weak temporal dependence specification given by an HMM. We advocate two-dimensional models for speech because speech data representations are almost exclusively two-dimensional: one dimension is time, and the other is frequency (or something of the sort). In our new model, the observation process is modeled by an MRF, in which the local dependences in both the time and the frequency domains are modeled by the conditional distribution of each individual component in the time-frequency plane, given its nearest neighbors.

(3) We introduce a local "smoother" into the specification of the conditional mean (i.e., the mean of the MRF) of the observation process given the state process, to reduce the discontinuity observed in simulations of an HMM. The state-dependent mean spectrogram, produced by mapping the state sequence to the observations through the state-dependent mean vectors, is combined with the linear local smoother to produce the mean of the MRF.

An efficient algorithm based on the segmental K-means algorithm and the iterated conditional modes (ICM) method is proposed to estimate the parameters of our new model.

The model is implemented and applied to the recognition of segmented digits. Experiments show that the new model reduces classification error by 23% compared to the HMM.
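To make point (3) concrete, the following is a minimal Python sketch, not the thesis's implementation. It maps a state sequence to a state-dependent mean spectrogram and then applies a linear local smoother along the time axis. The function names (`mean_spectrogram`, `smooth_time`), the three-tap weights (0.25, 0.5, 0.25), the edge handling, and the toy data are all assumptions chosen for illustration; the abstract does not specify the form of the smoother.

```python
def mean_spectrogram(states, state_means):
    """Map a length-T state sequence to a T x F list of mean vectors.

    state_means[s] is the (hypothetical) mean spectral vector of state s.
    """
    return [list(state_means[s]) for s in states]


def smooth_time(spec, weights=(0.25, 0.5, 0.25)):
    """Apply a linear local smoother along the time axis only.

    Assumption: a symmetric moving average with edge frames repeated at the
    boundaries. The smoother used in the thesis may differ.
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize so the weights sum to 1
    half = len(w) // 2
    n_frames = len(spec)
    n_freq = len(spec[0])
    out = []
    for t in range(n_frames):
        row = []
        for f in range(n_freq):
            acc = 0.0
            for k, wk in enumerate(w):
                # Clamp the neighbor index to stay inside the spectrogram.
                idx = min(max(t + k - half, 0), n_frames - 1)
                acc += wk * spec[idx][f]
            row.append(acc)
        out.append(row)
    return out


# Toy example: two states, two frequency bins, one state change at t = 3.
states = [0, 0, 0, 1, 1, 1]
state_means = [[0.0, 0.0], [4.0, 8.0]]

raw = mean_spectrogram(states, state_means)
smoothed = smooth_time(raw)

for t, (r, s) in enumerate(zip(raw, smoothed)):
    print(t, r, s)
```

In the raw mean spectrogram the first frequency bin jumps from 0 to 4 in a single frame at the state boundary. After smoothing, the same change is spread over three frame-to-frame steps of 1, 2, and 1, which is the kind of reduced discontinuity that point (3) describes.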
Keywords/Search Tags:Model, HMM, Two-dimensional, Speech, Hidden, Markov, New, MRF