
Research on Speaker-Independent Continuous Chinese Digit String Speech Recognition

Posted on: 2008-05-11; Degree: Master; Type: Thesis
Country: China; Candidate: J Feng; Full Text: PDF
GTID: 2178360242467250; Subject: Computer application technology
Abstract/Summary:
Speech recognition has achieved high performance under laboratory conditions. Research on speech recognition in China began in the 1950s, has developed rapidly in recent years, and is starting to be applied in practical systems. In practical use, however, background noise, dialects, and accents keep speech recognition systems from being widely adopted. Solving these problems is especially important for embedded systems, which operate in complex environments.

Speech recognition research in our lab has only just begun, and building a large-vocabulary speech recognition system requires both a dictionary, which demands considerable linguistic knowledge, and a large speech database. This thesis therefore focuses on speaker-independent continuous Chinese digit string speech recognition. It covers adaptive endpoint detection, the contribution of individual Mel frequency cepstrum coefficient (MFCC) components to the recognition rate, and the choice of the number of hidden Markov model (HMM) states and the training set size.

A study of traditional endpoint detection shows that the assumption of a fixed coefficient α (α = 1) is not suitable when the signal-to-noise ratio (SNR) changes. An adaptation process based on a step-by-step approximation method needs to be run when the system starts working. Experimental results show that, after adaptation, the system can be used in lower-SNR environments.

MFCC is an effective feature for speech recognition. Traditionally, the first two MFCC components are discarded because they reflect the amplitude of the waveform and degrade recognition results. The experiments show, however, that although these two components contribute little to distinguishing between digits, they are useful for distinguishing speech from noise. They can therefore be used in the endpoint detection stage.

Experiments were also carried out on the number of HMM states and the training set size. They indicate that setting the number of HMM states to 5 and the training set size to 30 is a reasonable choice.
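
The abstract does not give the detector's formula or the exact adaptation rule, so the following is only a minimal sketch of one plausible reading. It assumes an energy threshold of the form noise mean + α × noise standard deviation, estimated from leading frames that are presumed to contain only noise, and it assumes that "step-by-step" adaptation means raising α in small increments until none of those noise frames is classified as speech. The function names, the `n_noise`, `step`, and `max_alpha` parameters, and the threshold form are all illustrative assumptions, not the thesis's method.

```python
import math


def frame_energies(samples, frame_len=80):
    """Short-time energy of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def noise_stats(energies, n_noise):
    """Mean and standard deviation of the first n_noise frame energies."""
    lead = energies[:n_noise]
    mean = sum(lead) / len(lead)
    var = sum((e - mean) ** 2 for e in lead) / len(lead)
    return mean, math.sqrt(var)


def adapt_alpha(energies, n_noise=10, alpha=1.0, step=0.5, max_alpha=20.0):
    """Assumed adaptation rule: raise alpha step by step until no
    leading (noise-only) frame exceeds mean + alpha * std."""
    mean, std = noise_stats(energies, n_noise)
    lead = energies[:n_noise]
    while alpha < max_alpha and any(e > mean + alpha * std for e in lead):
        alpha += step
    return alpha


def detect_endpoints(energies, alpha, n_noise=10):
    """Return (first, last) indices of frames above the threshold, or None."""
    mean, std = noise_stats(energies, n_noise)
    threshold = mean + alpha * std
    active = [i for i, e in enumerate(energies) if e > threshold]
    return (active[0], active[-1]) if active else None
```

On a toy energy sequence such as `[1, 2, 1, 3, 1, 2, 1, 3, 1, 2, 50, 60, 55, 2, 1]`, the fixed α = 1 threshold already fires on a noise frame at index 3, while the adapted value (α = 2.0 under these assumptions) isolates the three high-energy frames at indices 10 to 12. A real detector would also need hangover and minimum-duration logic, which this sketch omits.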
Keywords/Search Tags: Speech Recognition, Endpoint Detection, Adaptive, Mel Frequency Cepstrum Coefficient (MFCC)