
Online Learning of Large Margin Hidden Markov Models for Automatic Speech Recognition

Posted on: 2012-09-04
Degree: Ph.D.
Type: Thesis
University: University of California, San Diego
Candidate: Cheng, Chih-Chieh
Full Text: PDF
GTID: 2468390011958828
Subject: Artificial Intelligence
Abstract/Summary:
Over the last two decades, large margin methods have yielded excellent performance on many tasks. The theoretical properties of large margin methods have been intensively studied and are especially well established for support vector machines (SVMs). However, the scalability of large margin methods remains an issue due to the amount of computation they require. This is especially true for applications involving sequential data.

In this thesis we are motivated by the problem of automatic speech recognition (ASR), whose large-scale applications involve training and testing on extremely large data sets. The acoustic models used in ASR are based on continuous-density hidden Markov models (CD-HMMs). Researchers in ASR have focused on discriminative training of HMMs, which leads to models with significantly lower error rates. More recently, building on the successes of SVMs and various extensions thereof in the machine learning community, a number of researchers in ASR have also explored large margin methods for discriminative training of HMMs.

This dissertation aims to apply various large margin methods developed in the machine learning community to the challenging large-scale problems that arise in ASR. Specifically, we explore the use of sequential, mistake-driven updates for online learning and acoustic feature adaptation in large margin HMMs. The updates are applied to the parameters of acoustic models after the decoding of individual training utterances. For large margin training, the updates attempt to separate the log-likelihoods of correct and incorrect transcriptions by an amount proportional to their Hamming distance. For acoustic feature adaptation, the updates attempt to improve recognition by linearly transforming the features computed by the front end. We evaluate acoustic models trained in this way on the TIMIT speech database.
We find that online updates for large margin training not only converge faster than analogous batch optimizations, but also yield lower phone error rates than approaches that do not attempt to enforce a large margin.

We conclude this thesis with a discussion of future research directions, highlighting in particular the challenges of scaling our approach to the most difficult problems in large-vocabulary continuous speech recognition.
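The mistake-driven update described in the abstract can be sketched in simplified form. This is not the thesis's actual algorithm (which operates on CD-HMM parameters and log-likelihoods after decoding); it is a minimal illustration assuming utterance scores are linear in a parameter vector, with hypothetical names such as `online_margin_update`. The key idea it shows is the margin constraint scaled by Hamming distance: if the correct transcription does not outscore the decoded competitor by at least the Hamming distance, take a perceptron-style corrective step.

```python
import numpy as np

def online_margin_update(w, feat_correct, feat_decoded, hamming, lr=0.1):
    """One mistake-driven update on parameter vector w.

    Enforces that the correct transcription's score exceeds the decoded
    competitor's score by a margin proportional to the Hamming distance
    between them. If the constraint is violated, step toward the correct
    transcription's features (perceptron-style).
    """
    margin = np.dot(w, feat_correct) - np.dot(w, feat_decoded)
    if margin < hamming:  # margin constraint violated
        w = w + lr * (feat_correct - feat_decoded)
    return w

# Toy usage: zero-initialized model, one utterance whose decoding
# disagrees with the reference in two positions (hamming = 2).
w = np.zeros(3)
feat_correct = np.array([1.0, 0.0, 1.0])  # features of reference transcription
feat_decoded = np.array([0.0, 1.0, 0.0])  # features of decoded competitor
w = online_margin_update(w, feat_correct, feat_decoded, hamming=2)
```

After one update the model scores the reference transcription above the competitor; in an online scheme, such updates would be applied utterance by utterance as each training example is decoded.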
Keywords/Search Tags: Large, Speech, Recognition, Models, ASR, Online