Research On Practical Chinese Speech Recognition System And Its Key Technologies

Posted on: 2007-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Chen
GTID: 2178360275970004
Subject: Circuits and Systems
Abstract/Summary:
Communicating with machines by speaking and listening would bring great convenience and have a profound impact on daily life, which is why speech recognition technologies have attracted great attention since the 1950s. Chinese speech recognition is especially useful because entering Chinese with a keyboard and mouse is so inconvenient. This paper presents a complete project for Chinese speech recognition. Classical methods are studied and some improvements are made in each part of the system, especially in the following three areas.

The behavior of endpoint detection greatly affects the performance of isolated-word and connected-word recognition. The paper studies the performance of classical two-level endpoint detection and uses automatic adaptation within limited boundaries when determining the threshold levels. A further detection algorithm based on Chinese characters is also presented.

One of the main causes of errors in isolated-word recognition is the presence of "Out of Vocabulary" (OOV) signals, which arise incidentally from the speaker or the environment. Several methods of OOV rejection are studied in the paper, and the final solution is a time-normalized neural network method that takes each model's self-match performance into account.

Another problem studied is the construction of the model base in a speaker-independent system. An algorithm is presented that greatly reduces the size of the model base without affecting recognition performance.

The final part of the paper presents an isolated-word recognition application, the Bookshop Guiding System, and a keyword spotting application, the Vocal Controlled Virtual Pet. Both systems are based on HTK from Cambridge University and use the technologies and improvements discussed above. They use Windows API functions to capture audio data from the microphone, convert the data to MFCC parameters, and use the HMM method in HTK to perform recognition.
The HMM recognition results are then fed to the OOV rejection module; if a result is judged to be OOV, no action is taken. The model base is built from a manually recorded database of isolated words and uses the model-base size reduction method discussed above.
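
The abstract gives no implementation details for its endpoint detector, so the following is only a minimal sketch of the general idea of two-level (double-threshold) endpoint detection with adaptive thresholds kept inside fixed bounds. It is not the thesis's algorithm: the function names, the use of short-time energy, the noise multipliers (3 and 10), and the bound values are all assumptions chosen for illustration.

```python
def frame_energies(samples, frame_len=160):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def detect_endpoints(energies, noise_frames=5,
                     low_bounds=(0.001, 0.01), high_bounds=(0.01, 0.1)):
    """Return (start_frame, end_frame) of the first speech segment, or None.

    Assumption: the first `noise_frames` frames contain only background
    noise. Both thresholds are scaled from that noise estimate, then
    clamped so that an unusual noise estimate cannot push them outside
    the given bounds (all numeric values here are illustrative).
    """
    if len(energies) <= noise_frames:
        return None

    noise = sum(energies[:noise_frames]) / noise_frames
    low = clamp(3.0 * noise, *low_bounds)      # lower (boundary) threshold
    high = clamp(10.0 * noise, *high_bounds)   # upper (confirmation) threshold

    # Level 1: a frame above the upper threshold confirms speech.
    peak = next((i for i, e in enumerate(energies) if e > high), None)
    if peak is None:
        return None

    # Level 2: extend outwards while energy stays above the lower threshold.
    start = peak
    while start > 0 and energies[start - 1] > low:
        start -= 1
    end = peak
    while end + 1 < len(energies) and energies[end + 1] > low:
        end += 1
    return start, end


if __name__ == "__main__":
    # Synthetic energy contour: silence, one word, silence.
    contour = [0.0005] * 5 + [0.005, 0.05, 0.2, 0.2, 0.05, 0.005] + [0.0005] * 5
    print(detect_endpoints(contour))
```

The upper threshold decides whether speech is present at all, while the lower threshold recovers the weak onset and tail of the word. Clamping keeps the detector usable when the initial noise estimate is unusually low or high.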
Keywords/Search Tags: speech recognition, Ending point detection, OOV rejection, model-base, HTK