
New approaches to automatic language identification

Posted on: 1997-06-20
Degree: Ph.D.
Type: Thesis
University: Rensselaer Polytechnic Institute
Candidate: Marcheret, Etienne
Full Text: PDF
GTID: 2468390014483325
Subject: Engineering
Abstract/Summary:
Over the past three years, both government agencies and telecommunication companies have put considerable effort into developing a language identification system. Although the needs of these two entities are quite different, the impetus for both government and industry to seriously pursue an automatic language identification system came from the availability of a large corpus of multi-lingual speech data. Beginning in the late 1980s and continuing today, the Oregon Graduate Institute (OGI) has been collecting multi-lingual speech data. To date, the OGI database consists of spontaneous and fixed-vocabulary utterances in 11 languages. In 1993 the National Institute of Standards and Technology (NIST) designated the OGI database as the standard for evaluating language identification algorithms. NIST has since been coordinating the evaluation process, and at last count (June 1995) ten research sites in the United States (AT&T, BBN (Bolt, Beranek, and Newman), Dragon Systems, ITT (International Telephone and Telegraph), Lockheed-Sanders, MIT Laboratory for Computer Science, MIT Lincoln Laboratory, Natural Speech Technologies, OGI, and RPI) have been involved in these ongoing evaluations. The availability of a standard multi-lingual speech database has led to a proliferation of approaches to language identification and to the open exchange of ideas between researchers, making an accurate and fast automatic language identification system a realistic prospect in the not-too-distant future.

We can only speculate on the uses that government agencies would find for an automatic language identification system. One possibility is as a front end for multi-lingual speech recognition systems, directing telephone calls to the speech recognition system of the matching language. Industry uses are more obvious: telephone companies will be better equipped to handle foreign-language calls if an automatic language identification system can route each call to an operator fluent in that language. An increase in reported cases of emergency response operators being unable to understand the language of a distressed caller led AT&T to introduce the Language Line Interpreter Service. This service uses trained human interpreters to handle approximately 140 languages; an automatic language identification system could greatly assist these human operators.

The primary goal of this thesis is the development of an automatic language identification system. There are also contributions to phoneme recognition, with the development of a continuous observation density hidden semi-Markov model (HSMM). The usefulness of multi-sensor decision theory as applied to language identification is investigated. The failsafe property inherent in a multi-sensor structure is shown to effectively combat the low phoneme recognition accuracies present in speech recognition systems. Random walk theory is developed for a language identification system. The random walk classifiers are shown to outperform the classical Bayesian classifiers. The performance of the language identification system developed in this thesis has been evaluated by NIST. The phoneme recognition accuracy produced by the newly developed continuous observation probability HSMM is compared against a previously existing discrete observation probability HSMM and against commercially available hidden Markov models.
Keywords/Search Tags: Language identification, HSMM, Multi-lingual speech, Development, OGI