
Development of an approach to language identification based on language-dependent phone recognition

Posted on: 1996-06-13
Degree: Ph.D
Type: Thesis
University: Oregon Graduate Institute of Science and Technology
Candidate: Yan, Yonghong
Full Text: PDF
GTID: 2465390014485609
Subject: Computer Science
Abstract/Summary:
The goal of Language Identification (LID) is to quickly and accurately identify the language being spoken. Although the differences among spoken languages are generally large by any sensible measure, automatic language identification remains a major challenge (perhaps indicating the immaturity of the field of speech processing).

Current language identification systems vary greatly in terms of information utilization and system complexity. Understanding all of these approaches in a unified framework is one of the major challenges in automatic language identification. In this dissertation we provide a partial unification by studying the roles of acoustic, phonotactic and prosodic information in a particular system for language identification.

A comparative study was first conducted on a common two-language task (English and Japanese) to get a grasp of these issues. The results from the comparative experiments were used as the basis for the development of a general-purpose language identification baseline system.

Within this framework, two novel LID information sources (a backward language model and a context-dependent duration model) were introduced. These two models increased language modeling accuracy at a moderate cost in terms of training data. Also, a novel optimization method was introduced to enhance the discrimination between different languages. These methods led to substantial improvements in system performance. Preliminary studies into channel normalization, conversational speech and system adaptation to new languages were also pursued.

A general-purpose LID software toolkit was developed based on the algorithm developed in this thesis work. The final LID system attained correct rates of 91% (45-second segments) and 77% (ten-second segments) on a commonly used nine-language task. This is one of the best results reported to date on this task.
Keywords/Search Tags:Language, LID