Study And Improve On The Mongolian Speech Recognition System

Posted on: 2010-08-19  Degree: Master  Type: Thesis
Country: China  Candidate: L Fei  Full Text: PDF
GTID: 2178360278967624  Subject: Computer application technology
Abstract/Summary:
Speech recognition is an important research topic in pattern recognition, and its development will strongly influence the future of human-computer interfaces. It is a broadly interdisciplinary field, closely related to acoustics, linguistics, artificial intelligence, digital signal processing, and pattern recognition.

Mongolian is an agglutinative language and the main national language of Inner Mongolia. Research on Mongolian speech recognition is still at an early stage: low recognition rates and the reliance on noise-free environments are the main obstacles to a breakthrough. Based on the characteristics of Mongolian, this thesis improves and optimizes the acoustic model and the language model of a Mongolian speech recognition system.

First, we built context-dependent phonetic models for the system and applied parameter tying with both a bottom-up approach and a top-down approach, comparing the resulting recognition rates. Second, we built a continuous HMM (CHMM) with Gaussian mixture densities and a multiple-data-stream semi-continuous HMM (SCHMM), and compared the performance of these models. Finally, we built a trigram language model and experimented with and compared several smoothing methods to improve the recognition rate.

On this basis, using the HTK and CMU_Cam_Toolkit tools, we adopted triphones, a decision tree strategy, a trigram language model, and the multiple-data-stream SCHMM, and ran extensive experiments on the test sets. The sentence recognition rate reached 74.78% and the word recognition rate reached 96.96%. System performance was optimized and the recognition rate improved.
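As an illustration of the context-dependent (triphone) modeling mentioned in the abstract, the following minimal Python sketch expands a monophone sequence into triphone labels in the HTK-style `left-phone+right` notation. The function name `to_triphones` and the sample phone symbols are hypothetical and not taken from the thesis; the sketch covers word-internal expansion only, and the thesis may have treated word boundaries differently.

```python
from typing import List


def to_triphones(phones: List[str]) -> List[str]:
    """Expand a monophone sequence into word-internal triphone labels.

    Labels follow the HTK-style convention "left-phone+right".
    Phones at the edges keep only the context that exists, so the
    first phone becomes "phone+right" and the last "left-phone".
    (Hypothetical helper for illustration, not code from the thesis.)
    """
    labels = []
    for i, phone in enumerate(phones):
        label = phone
        if i > 0:
            # Prepend the left context when there is a preceding phone.
            label = f"{phones[i - 1]}-{label}"
        if i < len(phones) - 1:
            # Append the right context when there is a following phone.
            label = f"{label}+{phones[i + 1]}"
        labels.append(label)
    return labels


if __name__ == "__main__":
    # Sample phone symbols are placeholders, not a real Mongolian phone set.
    print(to_triphones(["s", "a", "n"]))
```

Because the number of distinct triphones grows quickly with the size of the phone set, many of them have little or no training data, which is why the parameter tying step described in the abstract is needed.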
Keywords/Search Tags: Speech Recognition, Hidden Markov Model, Triphone, Decision Tree, Acoustic Model, Language Model