Minimum Bayes-risk automatic speech recognition

Posted on:2002-06-19

Degree:Ph.D

Type:Thesis

University:The Johns Hopkins University

Candidate:Goel, Vaibhava

GTID:2468390011997309

Subject:Computer Science

Abstract/Summary:

Automatic speech recognition (ASR) systems find use in diverse tasks such as human-to-machine dialogue, language acquisition by non-native speakers, indexing and retrieval of multilingual audio information, and even assistance to individuals with speech impairment. Given the variety of applications to which ASR is put, the question arises whether a uniform ASR architecture is equally useful in all scenarios. It may be possible to improve the application-specific performance of ASR systems by adopting a framework that allows construction of task-dependent recognizers. This dissertation pursues that hypothesis.

We first argue that conventional ASR systems, which minimize the expected sentence error rate, are suboptimal for many tasks of interest. We then describe the framework of minimum Bayes-risk (MBR) classification. To implement the MBR recognizers, we describe a prefix-tree-based multi-stack A-star search algorithm that operates on recognition lattices. We provide experimental results showing that the MBR recognizers yield better recognition accuracy than the conventional maximum a posteriori probability (MAP) recognizer in the ASR tasks of word transcription, keyword identification, and named entity extraction. We also provide experimental results for the task of gene identification from genomic DNA to demonstrate the applicability of MBR classifiers to non-ASR tasks.

To simplify the implementation of the MBR recognizers, a segmental MBR classification scheme is presented. It decomposes a complex MBR recognizer into a sequence of simple recognizers by segmenting the recognition lattice or N-best list. An interesting outcome of the segmental MBR formulation is the derivation of ROVER and other voting techniques as its special cases. Our analysis shows inherent limitations of these voting procedures due to their implicit approximations and assumptions. To alleviate some of these limitations, we present two new procedures derived under the segmental MBR framework.

One shortcoming of our implementation of the MBR decoders is that it requires a lattice or an N-best list, and therefore a prior recognition pass before MBR decoding can be applied. We present preliminary formulations of a first-pass MBR recognition strategy that processes acoustic data directly. We also discuss extensions of minimum classification error training and other discriminative training methods in the segmental MBR framework.
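The contrast between MAP and MBR decoding described above can be illustrated with a minimal sketch on an N-best list. This is not the dissertation's implementation (which uses an A-star search over lattices); it assumes a small hypothetical N-best list with made-up posterior probabilities, uses word-level edit distance as the task loss, and approximates the expected loss by summing over the N-best entries only. All function names and data are illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Hypothesis = Tuple[str, ...]
NBest = List[Tuple[Hypothesis, float]]  # (word sequence, posterior probability)


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # match or substitution
        prev = curr
    return prev[-1]


def zero_one_loss(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Sentence-level loss: 1 if the sequences differ at all, else 0."""
    return int(tuple(ref) != tuple(hyp))


def map_decode(nbest: NBest) -> Hypothesis:
    """MAP rule: pick the hypothesis with the highest posterior."""
    return max(nbest, key=lambda item: item[1])[0]


def mbr_decode(nbest: NBest,
               loss: Callable[[Sequence[str], Sequence[str]], float]
               = word_edit_distance) -> Hypothesis:
    """MBR rule restricted to the N-best list: pick the candidate with the
    lowest expected loss, treating each entry in turn as the possible truth."""
    total = sum(p for _, p in nbest)  # renormalize over the list

    def risk(candidate: Hypothesis) -> float:
        return sum((p / total) * loss(truth, candidate) for truth, p in nbest)

    return min((hyp for hyp, _ in nbest), key=risk)


# Hypothetical N-best list; the posteriors are invented for illustration.
nbest: NBest = [
    (("i", "saw", "the", "man"), 0.40),
    (("i", "sew", "a", "van"), 0.30),
    (("i", "sew", "a", "man"), 0.30),
]

print("MAP:", " ".join(map_decode(nbest)))
print("MBR (word edit distance):", " ".join(mbr_decode(nbest)))
print("MBR (0/1 sentence loss):", " ".join(mbr_decode(nbest, zero_one_loss)))
```

In this toy list the MAP choice and the MBR choice under word edit distance differ, because the two lower-ranked hypotheses agree with each other on most words. Under the 0/1 sentence loss the MBR rule picks the same hypothesis as MAP, which reflects the abstract's point that the conventional recognizer targets sentence error rate.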

Keywords/Search Tags:

MBR, Recognition, ASR, Minimum, Speech, Tasks, Framework

Related items

1	Integration of multiple knowledge sources in speech recognition using minimum error training
2	Research On Noise Robust Methods In Mandarin Word Recognition
3	Effects of a novel right brain intervention on stuttering in familiar and structured speech tasks
4	A computational framework for exploring the role of speech production in speech processing from a communication system perspective
5	Research On Speech Recognition In Noisy Environment
6	Quadratic Time-Frequency Distribution Based Speech Signal Classification And Verification
7	Research On Continuous Speech Recognition Based On A Hybrid HMM/SVM Framework
8	The Application Of Feature Compensation Method Based On Probability Model In Speech Recognition
9	Anti-noise Technology Combined Denoising Method Based Speech Recognition Studies
10	Computational auditory scene analysis and robust automatic speech recognition