A study on the use of conditional random fields for automatic speech recognition

Posted on: 2011-03-18
Degree: Ph.D.
Type: Dissertation
University: The Ohio State University
Candidate: Morris, Jeremy J
Full Text: PDF
GTID: 1468390011472073
Subject: Computer Science
Abstract/Summary:
Current state-of-the-art systems for Automatic Speech Recognition (ASR) use statistical modeling techniques such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) to recognize spoken language. These techniques make use of statistics derived from the acoustic frequencies of the speech signal. In recent years, interest has been rising in the use of phonological features derived from these acoustic frequency features in addition to, or in place of, the acoustic frequency features themselves. These phonological features are derived from the manner in which speech is physically produced in the vocal tract of the speaker, rather than from models of how speech is heard by the listener.

Integrating phonological features into ASR models presents new challenges. The mathematical assumptions made to build current models may work well for features derived from acoustic frequencies, but do not necessarily fit phonological features as well. How to alter the mathematical models to accommodate this new type of input feature is an ongoing area of ASR research. This dissertation examines the use of the statistical model known as a Conditional Random Field (CRF) for ASR using phonological features. CRFs are statistical models of sequences that are similar to HMMs, but CRF models do not make any assumptions about the independence or interdependence of the data being modeled.

This dissertation provides (1) a CRF-based pilot system that achieves superior performance in a phonetic recognition task compared to a comparably configured HMM model, and does so with many fewer parameters, (2) an extension of this model to create new features for an HMM-based system for word recognition, and (3) a fully developed system for word recognition using CRFs.
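For readers unfamiliar with the model, the following is a minimal sketch of how a linear-chain CRF assigns a conditional probability to a label sequence (for example, one phone label per speech frame). It is a generic illustration, not the system described in the dissertation. The function names and the toy scores are hypothetical; in a real CRF each entry of `state_scores` would be a weighted sum of feature functions computed from the observed input.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(state_scores, transition_scores, labels):
    """Unnormalized log-score of one label sequence.

    state_scores[t][y]      : score of label y at frame t (assumed precomputed
                              from the observations; hypothetical toy input)
    transition_scores[a][b] : score of moving from label a to label b
    """
    score = sum(state_scores[t][y] for t, y in enumerate(labels))
    score += sum(transition_scores[a][b] for a, b in zip(labels, labels[1:]))
    return score


def log_partition(state_scores, transition_scores):
    """log Z(x): log of the summed scores of all label sequences,
    computed with the forward recursion in log space."""
    num_labels = len(state_scores[0])
    alpha = list(state_scores[0])
    for t in range(1, len(state_scores)):
        alpha = [
            state_scores[t][j]
            + _logsumexp([alpha[i] + transition_scores[i][j]
                          for i in range(num_labels)])
            for j in range(num_labels)
        ]
    return _logsumexp(alpha)


def conditional_probability(state_scores, transition_scores, labels):
    """P(labels | observations) for a linear-chain CRF."""
    return math.exp(
        sequence_score(state_scores, transition_scores, labels)
        - log_partition(state_scores, transition_scores)
    )
```

The point of the sketch is that the model normalizes over whole label sequences given the input, rather than modeling how the input itself is generated, so the scores at each frame can draw on arbitrary, overlapping features of the observations.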
Keywords/Search Tags: Recognition, Speech, Features, Models, ASR, System