
Augmented speech recognition

Posted on: 2008-03-20
Degree: M.A.Sc
Type: Thesis
University: Carleton University (Canada)
Candidate: Guo, Hua Jian
Full Text: PDF
GTID: 2448390005976445
Subject: Engineering
Abstract/Summary:
Two types of non-acoustic information sources, namely myoelectric signals (MES) and the general electromagnetic motion sensor (GEMS) signal, are investigated to overcome the limitations of conventional automatic speech recognition (ASR) systems. These limitations include degradation in noisy environments and reliance on the acoustic signal as the only modality.

A new training algorithm called approximated maximum mutual information (AMMI) is demonstrated to improve the accuracy of MES ASR using hidden Markov models. Results show that AMMI training consistently reduces error rates compared to conventional maximum likelihood training. Increases in accuracy of approximately 7% are observed at the empirically optimal operating point. A new ASR methodology using the GEMS signal is also presented. A classification accuracy of 68.9% is obtained for a ten-word vocabulary, confirming the presence of speech information in the GEMS signal.

Two types of multimodal ASR systems are developed combining MES, the GEMS signal, and the acoustic signal. Type I combines the outputs of multiple classifiers, each operating on a single signal modality. Type II combines the three modalities in a single classifier. Evaluation under acoustically noisy conditions shows that the multimodal systems outperform the unimodal acoustic ASR system in noisy environments. The acoustic ASR system has a classification error as high as 55.8% at an SNR of 15 dB, whereas the classification error of the optimal multimodal ASR system remains below 5.1% for the same range of noise.
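
As a rough illustration of the Type I idea (combining the outputs of per-modality classifiers), the sketch below fuses per-word log-scores with a weighted sum. The abstract does not state how the thesis combines classifier outputs, so the fusion rule, the function name `fuse_scores`, the weights, and all score values here are assumptions made purely for illustration.

```python
def fuse_scores(scores_by_modality, weights):
    """Weighted sum of per-class log-scores across modalities.

    scores_by_modality: {modality: {word: log_score}} (hypothetical classifier outputs)
    weights: {modality: weight} (hypothetical reliability weights)
    Returns (best_word, fused_scores).
    """
    # Take the vocabulary from the first modality; all modalities are assumed to share it.
    labels = list(next(iter(scores_by_modality.values())).keys())
    fused = {
        label: sum(weights[m] * scores[label]
                   for m, scores in scores_by_modality.items())
        for label in labels
    }
    best = max(fused, key=fused.get)
    return best, fused


# Made-up log-scores for a two-word vocabulary; not data from the thesis.
scores = {
    "acoustic": {"zero": -12.0, "one": -11.5},  # noisy channel slightly prefers "one"
    "mes": {"zero": -8.0, "one": -10.0},
    "gems": {"zero": -9.0, "one": -9.5},
}
# The acoustic channel is down-weighted, as one might do under acoustic noise.
weights = {"acoustic": 0.2, "mes": 0.4, "gems": 0.4}

best, fused = fuse_scores(scores, weights)
print(best, fused)
```

In this toy case the acoustic classifier alone would pick "one", while the fused decision follows the two non-acoustic modalities. A Type II system would instead concatenate features from all three signals and train a single classifier on them.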
Keywords/Search Tags:ASR system, GEMS, Signal, MES, Acoustic, Speech, Multimodal