Research And Application Of Search Algorithm For Continuous Speech Recognition

Posted on: 2003-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: L L Chen
GTID: 2168360095961045
Subject: Computer application technology
Abstract/Summary:
Continuous speech recognition (CSR) technology has made considerable progress over the last decade, enabling a variety of successful applications. With the arrival of the mobile internet and e-commerce, embedded systems and mobile communication demand faster and lower-cost CSR systems. The real-time performance of a CSR system, which depends on the efficiency of its search algorithm, has therefore become one of the challenging areas of speech recognition research. This thesis deals mainly with the theory and application of search algorithms for CSR.

The essence of CSR is a search in a state space defined by various knowledge sources, such as phonetics and linguistics. The Viterbi beam search algorithm, based on dynamic programming, has become increasingly popular in CSR because it allows knowledge sources to be organized efficiently to constrain the search space and different pruning strategies to be combined easily. In this thesis, we discuss the principles and implementation of the Viterbi beam search algorithm in depth, covering state-level pruning, word-level pruning, and maximum model pruning. To verify the algorithm and the pruning strategies, we built a small English CSR system called Ask The Way (ATW). Experimental results show that ATW runs in near real time on a 200 MHz Pentium CPU with 64 MB of memory, with a recognition rate above 97%.

The thesis also analyzes the limitations of the Viterbi beam search algorithm. Because it uses a fixed pruning threshold for the whole search process and ignores the large variation in ambiguity during the search, considerable computation is wasted. In addition, the computation of Gaussian mixture densities takes up a major portion of the overall recognition time. In this thesis, we improve the Viterbi beam search algorithm in two respects.
First, we present a new adaptive Viterbi beam search algorithm based on the variation in the number of active models. Compared with the standard Viterbi beam search algorithm, it reduces recognition time by 35.56%. Second, we use the nearest-neighbor approximation to compute Gaussian mixture densities, which reduces recognition time by 6.67% compared with the standard Viterbi beam search algorithm. Furthermore, we improve the nearest-neighbor approximation by evaluating the mixtures in order of their likelihood of being the best-scoring mixture, where this likelihood is estimated from previously processed data. This improved method reduces recognition time by 15.56% compared with the standard Viterbi beam search algorithm.

Conclusions and suggestions for future work are given at the end of the thesis.
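
The fixed-threshold, state-level pruning that the abstract attributes to standard Viterbi beam search can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `viterbi_beam`, the helper `log`, and the three-state left-to-right HMM with its probabilities are all hypothetical, chosen only to keep the example small. At every frame, states whose log score falls more than `beam` below the best score are dropped, and the number of surviving states is recorded.

```python
import math

NEG_INF = float("-inf")


def log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi_beam(log_init, log_trans, log_emit, beam):
    """Time-synchronous Viterbi search with a fixed state-level beam.

    log_init[s]     : log P(start in state s)
    log_trans[p][s] : log P(next state s | current state p)
    log_emit[t][s]  : log-likelihood of frame t in state s
    beam            : states scoring below (frame best - beam) are pruned
    Returns (best_log_score, best_state_path, active_state_counts).
    """
    n_states = len(log_init)
    # Each active hypothesis maps a state to (log score, path so far).
    active = {s: (log_init[s] + log_emit[0][s], [s])
              for s in range(n_states) if log_init[s] > NEG_INF}
    active_counts = []
    for t in range(len(log_emit)):
        if t > 0:
            expanded = {}
            for p, (score, path) in active.items():
                for s in range(n_states):
                    if log_trans[p][s] == NEG_INF:
                        continue  # transition not allowed
                    cand = score + log_trans[p][s] + log_emit[t][s]
                    # Viterbi recombination: keep the best path into s.
                    if s not in expanded or cand > expanded[s][0]:
                        expanded[s] = (cand, path + [s])
            active = expanded
        # Fixed-threshold pruning relative to the best score in this frame.
        best = max(score for score, _ in active.values())
        active = {s: v for s, v in active.items() if v[0] >= best - beam}
        active_counts.append(len(active))
    best_score, best_path = max(active.values(), key=lambda v: v[0])
    return best_score, best_path, active_counts


# Hypothetical 3-state left-to-right HMM observed over 4 frames.
LOG_INIT = [log(1.0), log(0.0), log(0.0)]
LOG_TRANS = [[log(0.6), log(0.4), log(0.0)],
             [log(0.0), log(0.7), log(0.3)],
             [log(0.0), log(0.0), log(1.0)]]
LOG_EMIT = [[log(0.8), log(0.1), log(0.1)],
            [log(0.5), log(0.4), log(0.1)],
            [log(0.1), log(0.6), log(0.3)],
            [log(0.1), log(0.2), log(0.7)]]
```

Calling `viterbi_beam(LOG_INIT, LOG_TRANS, LOG_EMIT, float("inf"))` disables pruning and gives the exact Viterbi path, while a narrow beam such as `math.log(2)` keeps fewer active states per frame. Because the threshold is the same in every frame, it cannot adapt to frames that are more or less ambiguous, which is the limitation the adaptive variant described above is meant to address.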
Keywords/Search Tags: continuous speech recognition (CSR), search algorithm, mel-frequency cepstral coefficients (MFCC), hidden Markov model (HMM), Viterbi beam search algorithm, Gaussian mixture density, Baum-Welch algorithm, forward-backward algorithm, state tying