
Parallel Optimization Method In Language Model For Mandarin Speech Recognition

Posted on: 2011-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: W Jin
Full Text: PDF
GTID: 2178330338990323
Subject: Information and Communication Engineering
Abstract/Summary:
The language model is a key component of an automatic speech recognition system. As the corpus grows larger and the speech to be processed becomes more complex, the language model, the back-end of the recognition system, has become one of its main parts; in some conditions the speech cannot be recognized at all without it. An appropriate language model gives the system higher precision and the ability to handle spontaneous speech; it makes the recognition output more reasonable and brings the recovered meaning closer to what the speaker actually said.

After extensive testing we found that a speech recognition system, used as a dialogue system, should provide the user with a friendly interface and a fast recognition process, so the system must be built and optimized for high throughput. Multi-core and parallel technologies have recently become popular, and they give us new ways to improve the system we have and to build systems with higher performance.

Against this background, we propose improving the performance of the speech recognition system with these technologies. We design a new data model to implement fast lookup in the tri-gram model, and with this data model we cache and optimize the N-gram language model. This method improves the system considerably, raising throughput to about three times its previous level.

In addition, we apply several further optimizations to the computation in the model that require neither large memory nor a multi-core architecture: inlining frequently called functions, optimizing expensive computations (for example, with a log lookup table), and removing redundant computation from the model. These methods also improve the performance and throughput of the recognition system.

Finally, we use parallel optimization on the forward-backward algorithm in the language model, which reduces the user's waiting time and improves the interface, and we also parallelize the processing of candidate sequences of different lengths. The test data contain 120 sentences totaling 644.3 seconds of speech. Recognition speed is greatly improved: the real-time factor is 0.2375 and the speed-up is about 3.3 times. The experimental results demonstrate the advantage of a system with a parallel-optimized back-end.
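To illustrate the kind of cached tri-gram lookup the abstract describes, the minimal sketch below places a small direct-mapped cache in front of a trigram log-probability table. The key packing, the cache size, the 21-bit vocabulary-id assumption, and the omission of back-off weights and thread safety are all simplifications for illustration; this is not the actual data model used in the thesis.

```cpp
// Illustrative sketch only: a direct-mapped cache in front of a trigram table.
#include <cstdint>
#include <unordered_map>
#include <vector>

struct TrigramTable {
    // Packed (w1, w2, w3) -> log probability; back-off handling omitted for brevity.
    std::unordered_map<uint64_t, float> logprob;

    static uint64_t key(uint32_t w1, uint32_t w2, uint32_t w3) {
        // Assumes vocabulary ids fit in 21 bits each.
        return (uint64_t(w1) << 42) | (uint64_t(w2) << 21) | uint64_t(w3);
    }

    float lookup(uint32_t w1, uint32_t w2, uint32_t w3) const {
        auto it = logprob.find(key(w1, w2, w3));
        if (it != logprob.end()) return it->second;
        return -99.0f;  // placeholder floor instead of real back-off
    }
};

// Small direct-mapped cache so repeated (history, word) queries made during
// decoding avoid hitting the full table every time. Single-threaded here.
class TrigramCache {
public:
    explicit TrigramCache(const TrigramTable& table, size_t slots = 1 << 16)
        : table_(table), keys_(slots, UINT64_MAX), values_(slots) {}

    float score(uint32_t w1, uint32_t w2, uint32_t w3) {
        uint64_t k = TrigramTable::key(w1, w2, w3);
        size_t slot = k % keys_.size();
        if (keys_[slot] != k) {          // cache miss: consult the full table
            keys_[slot] = k;
            values_[slot] = table_.lookup(w1, w2, w3);
        }
        return values_[slot];
    }

private:
    const TrigramTable& table_;
    std::vector<uint64_t> keys_;
    std::vector<float> values_;
};
```

A direct-mapped cache keeps each query at constant cost with no eviction bookkeeping, which suits the highly repetitive history-word lookups a decoder issues; the speed-up figures reported in the abstract refer to the thesis's own design, not to this sketch.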
Keywords/Search Tags: Speech recognition system, N-gram language model for Pinyin with tone, Parallel optimization, Throughput