Speech Recognition Method Based On Hidden Markov Models

Posted on: 2006-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: X L Chen
Full Text: PDF
GTID: 2208360155466850
Subject: Communication and Information System
Abstract/Summary:
Speech recognition is both a process in which a machine extracts character symbols from a speech signal and an interdisciplinary field closely related to acoustics, linguistics, artificial intelligence, digital signal processing, and pattern recognition. After nearly 50 years of development, speech recognition technology can now handle large-vocabulary, speaker-independent, continuous recognition, and Chinese speech recognition has reached a world-class level. Focusing on the characteristics of Chinese character pronunciation, this thesis studies medium-vocabulary, speaker-independent, isolated Chinese character speech recognition based on discrete hidden Markov model (DHMM) theory.

First, based on an analysis of the speech signal, the thesis modifies the algorithm for computing the short-time zero-crossing ratio and also modifies the end-point detection method, which uses two factors: amplitude and the short-time zero-crossing ratio. The thesis then studies the pronunciation of Chinese characters and analyzes the characteristics of the initial consonant and the simple or compound vowel as phoneme units. A method of searching for the transition point to separate the initial consonant from the simple or compound vowel is proposed.

Next, the thesis introduces feature extraction and vector quantization, the two most important parts of a speech recognition system, together with related background.

Finally, the HMM-based speech recognition system is analyzed in detail, and the selection of HMM parameters for speech recognition is discussed. For speaker-independent, medium-vocabulary, isolated-word speech recognition, the DTW and DHMM methods are compared, and the advantage of the DHMM method is demonstrated. The thesis also discusses the influence of feature extraction on the recognition rate and concludes that power-difference LPCC is an advantageous feature parameter. From a discussion of the choice of vector quantization parameters, it concludes that the codebook size for medium-vocabulary speech recognition should be 64 or 128. In addition, based on the study of Chinese pronunciation, a shortcoming of the HMM is addressed and a two-section HMM speech recognition method is proposed. Experimental results indicate that this method both reduces the system's recognition time and improves its recognition rate.
Keywords/Search Tags: Speech Recognition, End-point Detection, Short-time Average Zero Crossing Ratio, Vector Quantization, Discrete Hidden Markov Model