Rapid Keyword Spotting In Continuous Speech

Posted on: 2012-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: H Yuan
Full Text: PDF
GTID: 2218330362950445
Subject: Computer Science and Technology
Abstract/Summary:
Keyword spotting is a significant area of speech recognition: it detects a given set of words in continuous speech. As an application of continuous speech recognition, it offers a higher recognition rate, greater practicality, and lower time consumption. This thesis focuses on rapid keyword spotting in continuous speech, which requires reducing recognition time as much as possible while maintaining system performance, so that real-time requirements are met.

We discuss the principles and implementation of the Viterbi Beam search algorithm in depth, and then build a keyword spotting baseline system that uses offline filler models and the Token Passing algorithm to recognize speech online. However, the efficiency of the baseline system cannot satisfy real-time requirements, so the rest of the thesis focuses on speeding up the recognition process.

First, because the speech signal is very complicated, each state is modeled by a GMM (Gaussian Mixture Model). State likelihood computation is one of the most computationally expensive steps in a keyword spotting system, so speeding it up is a direct way to improve system performance. Based on an approximation of the GMM, we propose a new technique named Feature Similarity of Adjacent Frames. Relying on the high similarity of adjacent frames, it uses several of the highest-scoring mixture components of the previous frame to predict the dominant mixture component for the current frame. The technique reduces recognition time by 29.3% relative to the baseline system, with insignificant loss in performance.

Second, we analyze the limits of the Viterbi search algorithm. The basic Viterbi Beam search uses a fixed pruning threshold for the whole search, which neglects the large variation in ambiguity over the course of the search. We introduce adaptive Viterbi Beam pruning and propose a new pruning strategy based on quantiles. Compared with the baseline system, it reduces recognition time by 35%.

Third, decoding usually omits the probability of the feature sequence, so the decoder recognizes the best-matching words rather than the words with the highest confidence score. We propose a new pruning method based on confidence measure accumulation: on top of the general Viterbi Beam search, we add a confidence-based pruning step that steers the search towards the highest confidence measure. Compared with the fixed beam, it reduces recognition time by 5.7%, and the word lattice is only 70% of its original size, so a more complicated verification method can be used to reduce the false alarm rate.

Finally, we combine these methods and achieve good performance. The thesis closes with conclusions and suggestions for future work.
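The abstract does not spell out the quantile-based pruning rule, so the following is only a minimal sketch of one plausible reading, not the thesis's actual method. It assumes each active token carries a log score, and it keeps a token only if it survives both a fixed beam relative to the best score and a quantile cut over the current frame's scores. The names `quantile`, `prune_tokens`, `max_beam`, and `keep_fraction` are hypothetical.

```python
import math


def quantile(values, q):
    """Return the q-quantile (0 <= q <= 1) of values, with linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def prune_tokens(scores, max_beam=200.0, keep_fraction=0.5):
    """Prune the active tokens of one frame.

    scores: dict mapping a token id to its accumulated log score.
    max_beam: fixed beam width below the best score (assumed parameter).
    keep_fraction: approximate share of tokens the quantile cut keeps
        (assumed parameter).
    """
    if not scores:
        return {}
    best = max(scores.values())
    fixed_threshold = best - max_beam
    # The quantile threshold adapts to how the scores are spread in this frame.
    quantile_threshold = quantile(list(scores.values()), 1.0 - keep_fraction)
    # The tighter (higher) of the two thresholds decides which tokens survive.
    threshold = max(fixed_threshold, quantile_threshold)
    return {tok: s for tok, s in scores.items() if s >= threshold}


if __name__ == "__main__":
    frame_scores = {"a": -10.0, "b": -12.0, "c": -50.0, "d": -300.0}
    print(sorted(prune_tokens(frame_scores)))
```

In this sketch the fixed beam acts as a safety bound, while the quantile cut limits the number of surviving tokens in frames where many hypotheses have similar scores, which is where a fixed threshold would keep too many of them.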
Keywords/Search Tags: Keyword Spotting, Hidden Markov Model (HMM), Gaussian Mixture Model (GMM), Token Passing, Viterbi Beam Decoding, Confidence Measure (CM)