Study Of Speech Recognition System For Mandarin Digit Based On HMM

Posted on:2007-04-10

Degree:Master

Type:Thesis

Country:China

Candidate:Z G Hou

Full Text:PDF

GTID:2178360182488292

Subject:Circuits and Systems

Abstract/Summary:


Speech is an important means of communication between people and machines. Automatic Speech Recognition (ASR) lets machines understand human language and carry out the corresponding operations, and it can be applied in many areas. Although ASR has been studied for many years, many of its problems remain open.

The acoustic model of speech production and speech perception, together with speech recognition theory, forms the basis of ASR, so each step of the ASR process is analyzed in detail. An improved spectral entropy algorithm is proposed for endpoint detection, and experimental results show that using it improves the robustness of the system. The choice of speech feature parameters strongly affects the robustness and real-time performance of a speech recognition system. After short-time feature parameters and spectrograms are introduced, three approaches to extracting speech features, namely Linear Predictive Coding (LPC), Linear Predictive Cepstrum Coefficients (LPCC), and Mel-Frequency Cepstrum Coefficients (MFCC), are discussed in detail, and their distortion measures are introduced.

The DTW and HMM theories are discussed, and their application to recognition is analyzed through MATLAB programs. DTW-based isolated-word recognition systems, both speaker-independent and speaker-dependent, are discussed with different feature parameters. In addition, a small-vocabulary, speaker-independent speech recognition system based on HMM is constructed. Different feature parameters can be selected in this system, which shows good robustness. Experiments are conducted to recognize the Mandarin digits 0 to 9 with this system. The results show that 12-dimension LPCC is the most effective feature, but the recognition rate of 26-dimension MFCC is about 10% higher than that of 12-dimension LPCC.
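As an illustration of the DTW-based isolated-word matching mentioned in the abstract, the following is a minimal Python sketch, not the thesis's MATLAB implementation. It assumes each utterance is already a sequence of per-frame feature vectors (for example LPCC or MFCC frames), uses Euclidean distance as the frame-level distortion measure, and applies the basic symmetric step pattern without path constraints. The names `dtw_distance`, `recognize`, and `templates` are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: a and b are non-empty lists of equal-dimension frames.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # frame-level Euclidean distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch b (repeat frame of b)
                cost[i][j - 1],      # stretch a (repeat frame of a)
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]

def recognize(utterance, templates):
    """Return the label of the reference template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

With one reference template per digit, `recognize(frames, templates)` would return the label of the closest template. A speaker-independent setup would typically need several templates per digit, which this sketch does not cover.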

Keywords/Search Tags:

Automatic Speech Recognition (ASR), Linear Predictive Cepstrum Coefficients (LPCC), Mel-Frequency Cepstrum Coefficients (MFCC), Dynamic Time Warping (DTW), Hidden Markov Model (HMM)


Related items

1	Study Of Mandarin Digit Speech Recognition Algorithm Based On HMM Model
2	Study On The System Of Mandarin Digit Speech On The Basis Of DSP
3	The Recognition Model Research Based On Whole Acoustic Structure Features Of Speech Unit
4	Study On Isolated Mandarin Speech Recognition Technology
5	Research And Implementation Of Speech Recognition Algorithm Based On DSP
6	The Speech Recognition System Based On The HMMNN Model
7	Speech Recognition Algorithm
8	Research On Phonetic Similarity Evaluation Algorithm
9	Research On The Automatic Classification Of Cough
10	Research And Implement On Isolated Mandarin Speech Recognition