Detecting and correcting speech recognition errors during natural language understanding

Posted on: 2004-01-31
Degree: Ph.D
Type: Dissertation
University: The University of Rochester
Candidate: Zollo, Teresa Mary
Full Text: PDF
GTID: 1468390011476187
Subject: Computer Science
Abstract/Summary:
The focus of this work is to improve the ability of a spoken dialog system to identify and compensate for speech recognition errors. Rather than trying to eradicate errors coming from the speech recognizer, our work focuses on detecting, locating and correcting the errors that occur within the natural language understanding component of a spoken dialog system.

Current spoken dialog systems operate within narrow domains, and many work by filling in slots for the information they need to achieve a specific task. Such simple systems do not require a syntactic analysis of what the user said; they can accomplish their mission by recognizing only a few key phrases. We believe that as spoken language systems become more sophisticated, they will require a more thorough analysis of the user input, rendering many current robustness strategies ineffective. By identifying implausible speech recognition hypotheses, the spoken dialog system can attempt to repair the communication breakdown, either by using stochastic methods to predict what was actually said or by using an appropriate dialog strategy.

We show that by describing the expected structure of spoken turns in human-computer practical dialog and formalizing the structure by means of context-free grammar rules used by a traditional bottom-up chart parser, we can achieve 92.1% accuracy in the task of detecting erroneous speech recognizer output based solely on the chart generated during parsing, an improvement of 18.2 percentage points over the majority-class baseline. Furthermore, we can reliably locate the start index of errors within misrecognized strings using the chart and domain-specific word bigram models. We developed and implemented algorithms that use the predicted error start location together with the word bigram models, phonetic similarity and the recognized string to generate correction hypotheses.
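
To make the idea of locating an error start with a word bigram model concrete, here is a minimal sketch. It is not the dissertation's algorithm: the abstract gives no implementation details, and the actual method also uses the parser chart, which is omitted here. The toy corpus, the add-one smoothing, the threshold of 0.1, and the function names `train_bigrams`, `bigram_prob` and `guess_error_start` are all assumptions made for illustration.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count unigrams and bigrams over tokenized sentences (with a start marker)."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev) (smoothing choice is an assumption)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def guess_error_start(words, unigrams, bigrams, threshold):
    """Return the index of the first word whose bigram probability falls
    below the threshold, or None if every transition looks plausible."""
    vocab_size = len(unigrams)
    prev = "<s>"
    for i, word in enumerate(words):
        if bigram_prob(prev, word, unigrams, bigrams, vocab_size) < threshold:
            return i
        prev = word
    return None

# Hypothetical in-domain training utterances (not taken from the dissertation).
corpus = [
    "send the train to avon".split(),
    "send the engine to bath".split(),
    "move the train to corning".split(),
]
unigrams, bigrams = train_bigrams(corpus)

# A recognizer hypothesis in which "to" was misrecognized as "two".
hypothesis = "send the train two avon".split()
error_start = guess_error_start(hypothesis, unigrams, bigrams, threshold=0.1)  # -> 3
```

In this toy setup the unseen transition from "train" to "two" scores 1/12 under add-one smoothing, which is below the threshold, so index 3 is flagged. A correction step in the spirit of the abstract could then propose in-domain words that are both likely after "train" and phonetically close to "two"; that step is not shown here.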
Keywords/Search Tags: Spoken dialog system, Speech recognition, Errors, Detecting, Language