
Mixtures of inverse covariances: Covariance modeling for Gaussian mixtures with applications to automatic speech recognition

Degree: Ph.D (Thesis)
Posted on: 2005-12-18
University: Stanford University
Candidate: Vincent Vanhoucke
Subject: Engineering
Full Text: PDF
GTID: 2458390008994669
Abstract/Summary:
Gaussian mixture models (GMMs) are widely used in statistical pattern recognition for tasks ranging from image classification to automatic speech recognition. Because these models devote a large number of parameters to representing Gaussian covariances, they scale poorly to problems with many dimensions and many Gaussian components. In particular, this shortcoming limits the complexity of the mixtures used for acoustic modeling, and thereby the accuracy of real-time speech recognition systems.

This thesis addresses the scalability problems of Gaussian mixtures through a class of models, collectively called “mixtures of inverse covariances” (MIC), which approximate the inverse covariances in a Gaussian mixture while significantly reducing both the number of parameters to be estimated and the computation required to evaluate Gaussian likelihoods. The MIC model scales well to problems involving many Gaussians and high dimensionality, opening up new possibilities in the design of efficient and accurate statistical models. In particular, when applied to acoustic modeling for real-world automatic speech recognition tasks, MIC models significantly improve both the speed and accuracy of a state-of-the-art speech recognition system.
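A minimal numpy sketch of the parameter-sharing idea the abstract describes, assuming (as one common reading of the MIC model) that each Gaussian's inverse covariance is a nonnegative combination of a small set of shared prototype matrices. All names, dimensions, and the random data here are illustrative, not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, J = 4, 8, 3  # feature dims, Gaussian components, shared prototypes

# Shared symmetric positive-definite prototype matrices S_j (hypothetical).
A = rng.standard_normal((J, d, d))
S = A @ A.transpose(0, 2, 1) + d * np.eye(d)

# Per-Gaussian nonnegative mixture weights lambda_{kj}.
lam = rng.random((K, J)) + 0.1

# Each Gaussian's precision (inverse covariance): P_k = sum_j lambda_{kj} S_j.
# Only K*J scalars plus J matrices are stored, instead of K full matrices.
P = np.einsum('kj,jab->kab', lam, S)

x = rng.standard_normal(d)

# Naive quadratic forms x^T P_k x for every Gaussian: O(K * d^2).
naive = np.array([x @ P[k] @ x for k in range(K)])

# Shared computation: q_j = x^T S_j x once per prototype (O(J * d^2)),
# after which each Gaussian costs only a J-dimensional dot product.
q = np.einsum('a,jab,b->j', x, S, x)
fast = lam @ q

assert np.allclose(naive, fast)
```

Because the quadratic form is linear in the precision matrix, the per-prototype terms `q` can be reused across all `K` Gaussians, which is where both the parameter and likelihood-evaluation savings come from when `J` is much smaller than `K`.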
Keywords/Search Tags: Recognition, Gaussian, Automatic speech, Mixture, Models, Large number, Inverse, Covariances