Strategies For Improving Efficiency Of Speech Recognition

Posted on: 2009-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Cui
GTID: 2178360245469838
Subject: Signal and Information Processing
Abstract/Summary:
Accuracy has been the primary concern since speech recognition technology first appeared, and most research focuses on further reducing recognition error rates. Speed has received far less attention in the development of recognition algorithms. Practical recognition systems, however, must run in real time on common platforms, so speed is as important as accuracy, if not more so. Accuracy and computational effort generally conflict with each other. This thesis focuses on reducing computational effort without affecting accuracy.

This thesis builds two baseline systems: an isolated-word speech recognition system on a PDA and a continuous speech recognition system on a PC. The word list of the PDA system can be edited by users, and the PC system is intended to be ported to embedded systems. All research and experiments are based on these two systems.

The PDA system uses continuous HMMs as the acoustic model. It builds a linear search network from the word list, and the decoder uses depth-first search. The PC system uses semi-continuous HMMs as the acoustic model and a finite-state grammar, in the form of a deterministic finite automaton (DFA), as the language model. At initialization, the system combines the acoustic model and the language model to form a lexicon, and then decodes with the traditional time-synchronous Viterbi beam search.

First, we optimized the systems with basic methods. We applied floating-point to fixed-point conversion on the PDA system to improve speed, and we separated the initialization part of the PC system from the recognition part to reduce the computational effort of initialization.

Second, we studied the search methods of the decoder. We compared two network construction methods, the linear network and the lexical tree network, and ran experiments on beam search. We then added stack decoding as a second pass to improve the accuracy of the continuous speech recognition system.

Finally, we focused on pruning at the Gaussian layer and the dimension layer. We introduce three methods: the nearest-neighbourhood method, BBI, and Gaussian selection. Three PDE (Partial Distance Estimate) algorithms are applied to reduce the cost of likelihood computation, on the assumption that the previous frame is a good guide for the next one. The method that uses the previous frame's computation result as the Gaussian pruning threshold gave the best experimental results.

The PDA system improves its efficiency by 78.9% without a large change in the error rate, and the continuous speech recognition system improves its efficiency by 55.5% with only a small change in the error rate.
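
The dimension-layer pruning mentioned in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussians and a max-over-components score, neither of which the abstract confirms, and it is not the thesis's actual implementation. The function names (`log_const`, `log_gaussian_pde`, `best_component`) and the `threshold` parameter are hypothetical. The idea is that each dimension can only lower the log-likelihood, so a component can be abandoned as soon as its partial score falls below a threshold, which may be seeded from the previous frame.

```python
import math


def log_const(var):
    """Normalisation term of a diagonal-covariance Gaussian (log domain)."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)


def log_gaussian_pde(x, mean, var, const, threshold):
    """Accumulate the log-likelihood one dimension at a time.

    Every dimension subtracts a non-negative term, so the partial score
    never increases. Return None as soon as it drops below `threshold`.
    """
    score = const
    for xi, mi, vi in zip(x, mean, var):
        d = xi - mi
        score -= 0.5 * d * d / vi
        if score < threshold:
            return None  # pruned: cannot recover above the threshold
    return score


def best_component(x, gaussians, threshold=-math.inf):
    """Return (index, score) of the best surviving component.

    `threshold` is an assumed hook for a value derived from the previous
    frame. If every component is pruned, the index is None and the
    returned score is the threshold itself.
    """
    best_idx, best = None, threshold
    for idx, (mean, var) in enumerate(gaussians):
        s = log_gaussian_pde(x, mean, var, log_const(var), best)
        if s is not None:
            best_idx, best = idx, s
    return best_idx, best


gaussians = [([0.0, 0.0], [1.0, 1.0]), ([5.0, 5.0], [1.0, 1.0])]
frame = [0.1, -0.2]
idx, score = best_component(frame, gaussians)
```

With the default threshold of negative infinity, the running best score is the only pruning bound and the result equals a full evaluation of the maximum. A threshold taken from the previous frame prunes earlier but is only approximate, since a frame whose best score falls below that threshold returns no component.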
Keywords/Search Tags: Speech recognition, HMM model, lexicon, DFA, decoding algorithms, computation of the state output probability