
Study On Isolated Mandarin Speech Recognition Technology

Posted on: 2010-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: S X Bai
Full Text: PDF
GTID: 2178360278458930
Subject: Signal and Information Processing
Abstract/Summary:
Man has long dreamed of having a machine that can "listen to" and "speak" human languages. In the information era, this ideal is gradually becoming a reality thanks to state-of-the-art speech recognition technology, whose task is to enable machines to understand human speech. Isolated-word speech recognition is the foundation for deeper research on speech recognition: it is easy to implement, its techniques are mature, and its application prospects are broad. This paper analyzes and studies small-vocabulary, speaker-independent isolated-word speech recognition.

Firstly, this paper introduces the fundamentals of speech recognition. The components and principles of a typical speech recognition system are presented briefly; speech signal preprocessing, endpoint detection, feature parameters, and speech recognition methods are then analyzed; and the extraction of Mel frequency cepstrum coefficients (MFCC) features is discussed in detail.

Secondly, isolated-word endpoint detection algorithms are the main object of study. Starting from endpoint detection algorithms based on information entropy, band-partitioning spectral entropy, and variance of frequency, the original algorithms are revised and corresponding improved endpoint detection algorithms are proposed. Simulation results under the same SNR conditions show that the detection accuracy of the improved algorithms is significantly higher than that of the traditional threshold detection algorithm based on energy and zero-crossing rate; among them, the improved algorithm based on variance of frequency performs best.

Finally, speech recognition methods based on dynamic time warping (DTW) and the hidden Markov model (HMM) are studied in depth. The fast DTW algorithm has low complexity and is well suited to small-vocabulary speaker-dependent speech recognition; the experimental data show that its correct identification rate is close to 100%. For speaker-independent speech recognition, the mainstream HMM-based recognition method is used, and specific issues in applying the continuous HMM to speech recognition are also discussed. Ultimately, by combining the improved endpoint detection algorithms with the continuous HMM recognition method, an average recognition rate of up to 92% is achieved on a self-built speech database of spoken Chinese digits.
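
As a rough illustration of the template matching that DTW performs in isolated-word recognition, the following Python sketch implements the classic quadratic-time DTW recurrence and a nearest-template classifier. It is not the fast DTW variant studied in the thesis, whose details the abstract does not give. The function names `dtw_distance` and `recognize`, the Euclidean frame distance, and the toy one-dimensional "features" are all assumptions made for illustration; in practice each frame would be an MFCC vector.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    a, b: non-empty sequences of equal-dimension numeric tuples (frames).
    Uses Euclidean frame distance and the standard three-way step pattern.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")

    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW cost.

    templates: dict mapping a word label to its reference frame sequence.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Toy example with hypothetical 1-D features (real systems would use MFCC frames).
templates = {
    "yi": [(0.0,), (1.0,), (2.0,)],
    "er": [(5.0,), (5.0,), (5.0,)],
}
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]  # a time-stretched "yi"
print(recognize(utterance, templates))
```

Because the warping path may repeat frames, the stretched utterance aligns with its template at zero cost even though the two sequences differ in length, which is the property that makes DTW attractive for small-vocabulary speaker-dependent recognition.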
Keywords/Search Tags:isolated-word recognition, speaker-independent, endpoint detection, Mel frequency cepstrum coefficients, dynamic time warping, hidden Markov model