
Research Of Small And Medium Vocabulary Embedded Continuous Speech Recognition

Posted on: 2014-10-08
Degree: Master
Type: Thesis
Country: China
Candidate: W M Lin
Full Text: PDF
GTID: 2308330461473907
Subject: Computer software and theory
Abstract/Summary:
As embedded portable devices spread and speech recognition technology matures, speech recognition on embedded platforms has become an active research topic. Because computing and memory resources are limited, speech recognition on embedded systems is mostly restricted to small-vocabulary isolated-word recognition, while continuous speech recognition on embedded platforms still relies on cloud services. It is rarely found on embedded devices that have no network connection, mainly because of those same resource limits. Building a continuous speech recognition system under these constraints is therefore a significant task.

To this end, the paper designs a small and medium vocabulary continuous speech recognition system for an embedded platform, proposes an algorithm for building the search network, and optimizes the search algorithm. The system can be applied in specific domains and meets the real-time requirements of embedded systems while maintaining a good recognition rate. It is implemented using only the standard C library, so it can be ported to different embedded platforms.

The continuous speech recognition system is built on an ARM embedded development board. Its software consists of an audio processing module, a training module, and a speech recognition module. The audio processing module processes audio files and extracts MFCC feature parameters. The training and recognition modules use continuous Hidden Markov Models (HMMs) based on INITIAL/FINAL units as the acoustic model. The language model is rule-based, and the grammar rules are written in a notation modeled on regular expressions.

The paper then introduces two training methods: K-Means training and embedded training. It compares the recognition rates of the two methods under the same conditions. The results show that the recognition rates differ little, but K-Means training costs more to implement than embedded training.

Next, the paper studies how to optimize the construction of the search network and the search algorithm. It first proposes an algorithm that converts the grammar network into a search network and merges identical successor nodes in the grammar network to simplify it. The system then combines the grammar network with the acoustic models to form a search network organized as a lexical tree. The search algorithm is an optimized Time-Synchronous Viterbi Search. To speed up the search, the system uses a mixed pruning strategy that combines the advantages of Beam Pruning and Histogram Pruning.

Finally, the paper tests the performance of the system. Several groups of experiments on the pruning threshold are run and their results analyzed. To improve performance further, the system adopts Nearest-Neighbor and feature component reordering techniques. The optimized system shows a clear improvement in both recognition speed and recognition rate. Lastly, the paper reports the results of running the optimized system on different platforms, which show that the system has clear practical value.
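The mixed pruning strategy mentioned above can be illustrated with a minimal sketch in standard C. This is not the thesis's implementation, which the abstract does not describe in detail. It assumes that each active path at the current frame is represented only by its log-likelihood score, that Beam Pruning runs first, and that Histogram Pruning then tightens the cutoff to a bin boundary when too many paths survive. The names `mixed_prune`, `bin_of`, `NUM_BINS`, `beam`, and `max_active` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BINS 16 /* hypothetical histogram resolution */

/* Map a score to a histogram bin; bin 0 holds the scores closest to the best. */
static size_t bin_of(float best, float score, float bin_width)
{
    size_t b = (size_t)((best - score) / bin_width);
    return b >= NUM_BINS ? NUM_BINS - 1 : b;
}

/*
 * Prune the active path scores of one frame in place.
 * scores:     log-likelihoods of active paths (higher is better)
 * beam:       paths scoring below (best - beam) are dropped (Beam Pruning)
 * max_active: target upper bound on surviving paths (Histogram Pruning)
 * Returns the number of survivors, compacted to the front of scores.
 */
size_t mixed_prune(float *scores, size_t n, float beam, size_t max_active)
{
    assert(scores != NULL && beam > 0.0f && max_active > 0);
    if (n == 0)
        return 0;

    /* Beam Pruning: the threshold is relative to the best score of the frame. */
    float best = scores[0];
    for (size_t i = 1; i < n; i++)
        if (scores[i] > best)
            best = scores[i];
    float threshold = best - beam;

    /* Histogram of the scores that fall inside the beam. */
    size_t hist[NUM_BINS] = {0};
    float bin_width = beam / NUM_BINS;
    size_t in_beam = 0;
    for (size_t i = 0; i < n; i++) {
        if (scores[i] < threshold)
            continue;
        hist[bin_of(best, scores[i], bin_width)]++;
        in_beam++;
    }

    /* Histogram Pruning: keep whole bins, best first, while the running
     * count stays within max_active. No sorting is needed. */
    size_t keep_bins = NUM_BINS;
    if (in_beam > max_active) {
        size_t cum = 0;
        keep_bins = 0;
        while (keep_bins < NUM_BINS && cum + hist[keep_bins] <= max_active) {
            cum += hist[keep_bins];
            keep_bins++;
        }
        if (keep_bins == 0)
            keep_bins = 1; /* always keep the best bin, even if it exceeds the cap */
    }

    /* Compact the survivors to the front of the array, preserving order. */
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (scores[i] < threshold)
            continue;
        if (bin_of(best, scores[i], bin_width) < keep_bins)
            scores[kept++] = scores[i];
    }
    return kept;
}
```

In a real decoder each entry would also carry its search-network state and a back-pointer. Because the cutoff falls on a bin boundary, this sketch may keep fewer than `max_active` paths, which trades some precision for avoiding a sort on every frame.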
Keywords/Search Tags: Speech recognition, embedded system, HMM model, language model, Viterbi searching