
Chinese Continuous Speech Recognition Based On Sphinx

Posted on: 2011-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2178360305471897
Subject: Communication and Information System
Abstract/Summary:
Chinese continuous speech recognition has promising applications and far-reaching research value. Because Chinese syllables are short and easily confused, and because the language has many dialects, Chinese continuous speech recognition is more difficult than English continuous speech recognition, and it has become an active and challenging topic in the field of speech recognition. Based on the Sphinx speech recognition system developed by Carnegie Mellon University, and taking the characteristics of Chinese pronunciation into account, this thesis presents a basic study of medium-vocabulary, speaker-independent Chinese continuous speech recognition.

The essence of Chinese continuous speech recognition is to search for the best word sequence corresponding to the voice input within a state space built from several layers of knowledge, such as Chinese phonetics and linguistics. It also draws on the related techniques of feature extraction, acoustic modeling, language modeling, and search algorithms. The Sphinx continuous speech recognition system represents a relatively advanced level in this field. In this thesis, a Chinese continuous speech recognition system is built by combining the theory behind Sphinx with the characteristics of Chinese pronunciation.

Acoustic modeling and speech recognition theory are the basis for building speech recognition systems. A complete continuous speech recognition system comprises four parts: feature extraction, the acoustic model, the language model, and the search algorithm. The thesis is organized around these four parts. It first introduces the historical development and the basic theoretical components of Chinese continuous speech recognition, and then analyzes MFCC feature extraction in detail.
Then, through an in-depth study of the Sphinx acoustic model training tool SphinxTrain and the language model training tool CMUCLMTK, the thesis modifies the relevant parameters and trains an acoustic model and a language model suited to Chinese speech recognition. With the models trained, it focuses on the search algorithm at the decoding stage and builds the Chinese continuous speech recognition system on the PocketSphinx recognition engine. In the last part, the thesis verifies the effectiveness of the system through experiments and data analysis.

The thesis builds two systems. The first is a continuous Chinese digit recognition system, trained on the CASIA Chinese digital speech library, with a sentence recognition rate of up to 90% and a word recognition rate as high as 97.2%. The second is a medium-vocabulary continuous Chinese speech recognition system, trained on the CASIA Chinese speech testing library, which performs worse than the first: its sentence recognition rate is 70% and its word recognition rate is 96.7%. These results demonstrate the effectiveness of the system.
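The abstract reports a sentence recognition rate and a word recognition rate for each system without defining them. The following minimal Python sketch shows one common way such figures are computed. It assumes that the word rate is one minus the word error rate from an edit-distance alignment and that a sentence counts as recognized only on an exact match; the thesis may use a different definition. The function names and the pinyin digit strings are hypothetical illustrations, not data from the thesis.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, insertions and deletions
    needed to turn the token list `ref` into the token list `hyp`."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference word
                curr[j - 1] + 1,         # insert a hypothesis word
                prev[j - 1] + (r != h),  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1]


def recognition_rates(references, hypotheses):
    """Return (sentence_rate, word_rate) under the assumed definitions:
    sentence_rate = fraction of utterances recognized exactly,
    word_rate     = 1 - (total edit errors / total reference words)."""
    total_words = sum(len(r) for r in references)
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    exact = sum(r == h for r, h in zip(references, hypotheses))
    return exact / len(references), 1.0 - errors / total_words


# Hypothetical toy data: two digit strings, one with a single substitution.
refs = ["yi er san si wu".split(), "liu qi ba jiu ling".split()]
hyps = ["yi er san si wu".split(), "liu qi ba qi ling".split()]
sentence_rate, word_rate = recognition_rates(refs, hyps)
print(f"sentence rate: {sentence_rate:.1%}, word rate: {word_rate:.1%}")
```

Under these definitions a single wrong word fails the whole sentence, which is consistent with the sentence rates reported above being lower than the word rates, and with the gap widening for the medium-vocabulary system.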
Keywords/Search Tags: Chinese continuous speech recognition, Sphinx, Feature extraction, Acoustic model, Language model, Search algorithm