
Automatic language identification with recurrent neural networks

Posted on: 1998-11-15
Degree: D.Sc
Type: Dissertation
University: University of Massachusetts Lowell
Candidate: Braun, Jerome J
Full Text: PDF
GTID: 1468390014977317
Subject: Computer Science
Abstract/Summary:
Automatic Language Identification (LID) is the capability of a machine to determine the natural language of a spoken utterance. LID is an important domain in Speech Processing and its significance is growing. As a basic research area, LID is of interest as a mechanism automating one of the capabilities of the human brain. Practical applications of LID include systems that must recognize the talker's language as part of their primary functionality. LID is an important enabling technology that can augment and enhance speech recognition facilities, e.g., in multi-lingual multimedia and translation systems. Approaches to LID include, among others, Hidden Markov Modeling techniques, phonotactics, prosody, and Large Vocabulary Continuous Speech Recognition (LVCSR). In spite of a surge in LID efforts during recent years, Automatic Language Identification remains an open research area. While some approaches offer solutions to particular application scenarios, this dissertation is concerned with a general, essential LID task (i.e., LID without recognition capabilities at word-level and above), exploiting general, language-related speech phenomena.
In this dissertation, a novel approach to essential Automatic Language Identification is proposed. The Recurrent Neural Network (RNN) architecture is proposed as the fundamental LID mechanism. The motivation for the RNN-based approach (as opposed to feedforward networks, e.g., MLP) includes addressing the long-term intra-utterance context, proposed as a critical element for essential LID. Our approach also postulates a non-uniform distribution of LID-specific information, and introduces the concept of Perceptually Significant Regions (PSRs) that contain elevated levels of such information within the utterance. Our approach proposes a novel method called Perceptually Guided Training (PGT) for exploitation of this non-uniformity.
The developmental and experimental aspects of this research include the LIREN/PGT (Language Identification with REcurrent Neural networks and PGT) environment. The LID training experiments show the efficacy of the PGT method by demonstrating improvement of the training process behavior. This research also includes the investigation of a number of other issues in LID training, and it proposes a number of algorithmic enhancements related to the LID Recurrent Neural Network training.
Keywords/Search Tags:LID, Language identification, Recurrent neural, Automatic language, Training