Minimum Bayes-risk automatic speech recognition

Posted on: 2002-06-19
Degree: Ph.D.
Type: Thesis
University: The Johns Hopkins University
Candidate: Goel, Vaibhava
Full Text: PDF
GTID: 2468390011997309
Subject: Computer Science
Abstract/Summary:
Automatic speech recognition (ASR) systems find use in diverse tasks such as human-to-machine dialogue, language acquisition by non-native speakers, indexing and retrieval of multilingual audio information, and even assistance to individuals with speech impairments. Given the variety of applications to which ASR is put, the question arises whether a uniform ASR architecture is equally useful in all scenarios. It may be possible to improve the application-specific performance of ASR systems by adopting a framework that allows construction of task-dependent recognizers. This dissertation pursues that hypothesis.

We first argue that conventional ASR systems, which minimize expected sentence error rate, are suboptimal for many tasks of interest. We then describe the framework of minimum Bayes-risk (MBR) classification. A prefix-tree-based multi-stack A-star search algorithm on recognition lattices is described to implement the MBR recognizers. We provide experimental results showing that the MBR recognizers yield better recognition accuracy than the conventional maximum a posteriori probability (MAP) recognizer in the ASR tasks of word transcription, keyword identification, and named entity extraction. We also provide experimental results for the task of gene identification from genomic DNA to demonstrate the applicability of MBR classifiers to non-ASR tasks.

To simplify the implementation of the MBR recognizers, a segmental MBR classification scheme is presented. It decomposes a complex MBR recognizer into a sequence of simple recognizers by segmenting the recognition lattice or N-best lists. An interesting outcome of the segmental MBR formulation is the derivation of ROVER and other voting techniques as its special cases. Our analysis shows inherent limitations of these voting procedures due to their implicit approximations and assumptions. To alleviate some of these limitations, we present two new procedures derived under the segmental MBR framework.

One shortcoming of our implementation of the MBR decoders is the requirement of a lattice or an N-best list, which in turn requires a recognition pass before MBR decoding can be applied. We present our preliminary formulations towards a first-pass MBR recognition strategy that processes acoustic data directly. We also discuss extensions of minimum classification error training and other discriminative training methods in the segmental MBR framework.
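
The contrast between MAP and MBR decoding can be illustrated with a minimal sketch. It is not the dissertation's implementation, which searches recognition lattices with a multi-stack A-star algorithm; instead it rescores a small N-best list exhaustively. The function names, the word-level edit distance used as the loss, and the posterior values in `NBEST` are all assumptions made up for illustration. Each hypothesis is scored by its expected loss against every other hypothesis in the list, and the one with the lowest expected loss is chosen.

```python
from typing import Callable, List, Sequence, Tuple

Hypothesis = Tuple[str, ...]
NBestList = List[Tuple[Hypothesis, float]]  # (word sequence, posterior weight)


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # delete a reference word
                            curr[j - 1] + 1,          # insert a hypothesis word
                            prev[j - 1] + (r != h)))  # match or substitute
        prev = curr
    return prev[-1]


def zero_one_loss(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Sentence-level 0/1 loss: any difference counts as one error."""
    return int(tuple(ref) != tuple(hyp))


def map_decode(nbest: NBestList) -> Hypothesis:
    """MAP rule: pick the single most probable hypothesis."""
    return max(nbest, key=lambda item: item[1])[0]


def mbr_decode(nbest: NBestList,
               loss: Callable[[Sequence[str], Sequence[str]], float] = edit_distance
               ) -> Hypothesis:
    """MBR rule over an N-best list: pick the hypothesis with minimum expected loss.

    The N-best list serves as both the hypothesis space and the evidence
    space, and weights are renormalized to sum to one (an assumption of
    this sketch).
    """
    total = sum(p for _, p in nbest)
    best_hyp, best_risk = nbest[0][0], float("inf")
    for hyp, _ in nbest:
        risk = sum((p / total) * loss(evidence, hyp) for evidence, p in nbest)
        if risk < best_risk:
            best_hyp, best_risk = hyp, risk
    return best_hyp


# Hypothetical posteriors, invented for illustration only.
NBEST: NBestList = [
    (("recognize", "speech"), 0.40),
    (("wreck", "a", "nice", "beach"), 0.35),
    (("wreck", "a", "nice", "peach"), 0.25),
]

if __name__ == "__main__":
    print("MAP:", " ".join(map_decode(NBEST)))
    print("MBR (edit distance):", " ".join(mbr_decode(NBEST)))
    print("MBR (0/1 loss):", " ".join(mbr_decode(NBEST, zero_one_loss)))
```

With the sentence-level 0/1 loss, the expected loss of a hypothesis is one minus its posterior, so the MBR rule selects the same hypothesis as MAP. With a word-level loss the two rules can disagree: in this toy list, two similar hypotheses together hold more probability mass than the single top-scoring one, so the MBR choice moves to one of them.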
Keywords/Search Tags:MBR, Recognition, ASR, Minimum, Speech, Tasks, Framework