
Discriminative training of language models for speech recognition

Posted on: 2011-06-04
Degree: M.Sc
Type: Thesis
University: York University (Canada)
Candidate: Magdin, Vladimir
Full Text: PDF
GTID: 2448390002453663
Subject: Artificial Intelligence
Abstract/Summary:
This thesis presents a novel discriminative training algorithm for n-gram language models for use in large vocabulary continuous speech recognition (LVCSR). Language models play an important role in speech recognition because they constrain the potentially vast search space of possible hypotheses. The discriminative training algorithm introduced in this thesis estimates the parameters of a standard n-gram language model so as to increase recognition rates in speech recognition tasks.

Experimental results on the Speech in Noisy Environments 1 (SPINE1) speech recognition corpus show that the proposed discriminative training method can outperform conventional discounting-based maximum likelihood estimation methods: a relative word error rate reduction of over 2.5 percent was observed on the SPINE1 task.

Two formulations of the algorithm are presented. One uses maximum mutual information estimation (MMIE) and the other uses large margin estimation (LME) to build an objective function involving a metric computed between correct transcriptions and their competing hypotheses, which are encoded as word graphs generated by the Viterbi decoding process. The nonlinear MMIE/LME objective functions are approximated by linear functions via an auxiliary function inspired by the Expectation-Maximization (EM) algorithm. After this linear approximation, the nonlinear discriminative training problem for n-gram language models becomes a linear programming problem, which can be solved efficiently with widely available convex optimization tools.
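For orientation, the MMIE criterion commonly used in discriminative training takes the following textbook form; the thesis' exact objective (scaling factors, and the margin term used in the LME variant) may differ:

    F_{\mathrm{MMIE}}(\lambda) = \sum_{r} \log \frac{P_\lambda(W_r)\, p(X_r \mid W_r)}{\sum_{W \in G_r} P_\lambda(W)\, p(X_r \mid W)}

where X_r is the acoustic observation sequence of utterance r, W_r its reference transcription, G_r the word graph of competing hypotheses, and P_\lambda the n-gram language model with parameters \lambda.

The sketch below illustrates, in Python, how one linearized update step could be posed as a linear program. It is a minimal toy example under stated assumptions, not the thesis' implementation: it assumes the auxiliary function reduces the objective's gradient with respect to each log n-gram probability to that n-gram's count in the reference transcriptions minus its expected count in the competitor word graphs, and all variable names and statistics are hypothetical.

    # Minimal sketch (hypothetical, not the thesis' code): one linearized
    # update of n-gram log-probabilities posed as a linear program.
    import numpy as np
    from scipy.optimize import linprog

    # Toy sufficient statistics for three n-grams (hypothetical values):
    ref_counts = np.array([3.0, 1.0, 0.0])   # counts in reference transcriptions
    comp_counts = np.array([1.5, 1.2, 0.8])  # expected counts in word graphs

    # Linearized objective: maximize (ref_counts - comp_counts) . x, where
    # x holds the updates to the log n-gram probabilities. linprog minimizes,
    # so the coefficient vector is negated.
    c = -(ref_counts - comp_counts)

    # Box constraints act as a trust region, keeping each update small enough
    # that the linear approximation of the nonlinear MMIE/LME objective stays
    # reasonable; the zero-sum equality is a crude stand-in for the proper
    # normalization constraints a real formulation would carry.
    delta = 0.1
    bounds = [(-delta, delta)] * len(c)
    res = linprog(c, A_eq=[[1.0] * len(c)], b_eq=[0.0], bounds=bounds)
    print(res.x)  # per-n-gram log-probability updates

The trust-region bound reflects a general property of this kind of scheme: a linear surrogate of a nonlinear objective is only trustworthy near the current model, so updates must be kept small and the linearize-and-solve step repeated.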
Keywords/Search Tags: Language models, Speech recognition, Discriminative training, Algorithm