A Study Of Speech Keyword Recognition Technology

Posted on: 2009-11-14 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: C L Sun | Full Text: PDF
GTID: 1118360245969617 | Subject: Signal and Information Processing
Abstract/Summary:
Keyword recognition is an important area of speech recognition. It is more flexible than continuous speech recognition and applies to a wider range of tasks. In this dissertation, we investigate several keyword recognition technologies that can be deployed in different real-world applications. Garbage model based keyword spotting can be built into real-time spoken dialogue systems and voice command systems, while syllable lattice based and confusion network based keyword recognition can be deployed in large-vocabulary spoken document content retrieval. Our research focuses on the keyword detection algorithms and the speech verification methods of the keyword recognition system. The main research contents and innovations are described in detail as follows:

1. Utterance verification methods in the garbage model based keyword spotting (GM-KWS) system. The likelihood ratio test is often used to solve the utterance verification problem in a GM-KWS system. We propose a verification method based on a weighted combination of log-likelihood ratios (LLRs) over competing models. The sub-word confidence measure (CM) is estimated by combining the LLR scores of the target model against its competing models, and the weighting coefficients are obtained by training under the minimum verification error criterion. Experimental results show that the proposed method outperforms conventional LLR-based approaches. Through an analysis of confidence features, we select the on-line garbage score, the duration probability and the LLR to score the confidence of keyword candidates. Experimental results show that combining these features noticeably improves false-rejection performance.

2. Keyword search algorithm and verification method in the syllable lattice based keyword recognition (KR) system. A syllable lattice based keyword recognition system suffers from a lower detection rate because it receives weaker guidance from the language model. We present an improved Minimum Edit Distance (MED) based keyword search algorithm that takes higher-order context syllabic confusion into account when system-dependent substitution errors occur. In the keyword verification stage, a new confidence function is presented to suppress the false alarms introduced by MED search. The reported experiments demonstrate that the proposed keyword search algorithm and verification method substantially outperform conventional string-match search methods, with higher detection and verification capability.

3. Syllable confusion network (SCN) based spoken document content retrieval technology. We design an SCN based spoken document content retrieval system and investigate the indexing mechanism of the retrieval system. Experimental results show that the SCN based retrieval system clearly outperforms the syllable lattice based system. We also present an automatic query expansion scheme based on a modified two-stage decoding. A modified Viterbi decoder first generates a lattice of confusable syllables for the query, and an A* search then generates the most confusable phrases from this lattice. The number of expansion terms is controlled by the confidence score of each term. Experimental results show that the proposed scheme effectively improves the term detection rate.

4. Speech recognition error correction method. A divide-and-conquer error correction scheme is introduced for a Chinese syllable recognition task. We first transform the continuous speech recognition problem into a sequence of independent classification tasks. Each sub-task is an isolated word recognition problem, and specialized SVMs are trained and applied to each problem to discriminate among the recognized candidates from the confusion networks. A codebook mapping method is proposed to convert variable-length speech sequences into fixed-dimension feature vectors suitable for SVM processing. The posterior estimates from the SVM and the SCN are combined to obtain the confidence score used to correct recognition errors. Experimental results show that the proposed approach effectively improves recognition accuracy.
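As an illustration of the weighted LLR combination described in item 1, the following minimal sketch computes a confidence score as a weighted sum of log-likelihood ratios between a target model and its competing models. The abstract does not give the exact form of the combination, so the linear sum, the function names `weighted_llr` and `accept`, and the example weights are all assumptions. In the dissertation the weights come from minimum verification error training, which is not shown here.

```python
from typing import Sequence


def weighted_llr(target_ll: float,
                 competing_ll: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Weighted sum of log-likelihood ratios (assumed linear combination).

    target_ll    -- log-likelihood of the segment under the target model
    competing_ll -- log-likelihoods under each competing model
    weights      -- one weight per competing model (hypothetical values;
                    the dissertation trains them discriminatively)
    """
    if len(competing_ll) != len(weights):
        raise ValueError("need exactly one weight per competing model")
    # Each term is one LLR: log p(O | target) - log p(O | competitor_k).
    return sum(w * (target_ll - c) for w, c in zip(weights, competing_ll))


def accept(score: float, threshold: float = 0.0) -> bool:
    """Accept the hypothesis when the confidence score exceeds the threshold."""
    return score > threshold


if __name__ == "__main__":
    score = weighted_llr(-10.0, [-12.0, -11.0], [0.5, 0.5])
    print(score, accept(score))
```

A larger score means the target model explains the segment better than its competitors. The threshold sets the trade-off between false alarms and false rejections.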
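The MED-based keyword search in item 2 can be sketched as approximate substring matching in which substitution costs are lowered for syllable pairs the recognizer tends to confuse. This is a simplified illustration, not the dissertation's algorithm: it searches a single best syllable string rather than a lattice, it ignores higher-order context, and the `CONFUSION_COST` table and its values are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical confusion costs: (keyword syllable, recognized syllable) -> cost.
# Illustrative values only; a real table would be estimated from recognizer errors.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("shi", "si"): 0.3,
    ("zhong", "zong"): 0.3,
}


def sub_cost(ref: str, hyp: str) -> float:
    """Substitution cost: 0 for a match, reduced for known confusions, else 1."""
    if ref == hyp:
        return 0.0
    return CONFUSION_COST.get((ref, hyp), 1.0)


def search_keyword(keyword: List[str], hyp: List[str],
                   max_cost: float) -> List[Tuple[int, float]]:
    """Return (end_index, cost) for each position in hyp where the keyword
    aligns to some substring ending there with edit cost <= max_cost."""
    m = len(keyword)
    # prev[i]: cheapest alignment of keyword[:i] ending at the previous position.
    prev = [float(i) for i in range(m + 1)]
    hits: List[Tuple[int, float]] = []
    for j, syl in enumerate(hyp):
        cur = [0.0] * (m + 1)  # cur[0] = 0 lets a match start anywhere in hyp
        for i in range(1, m + 1):
            cur[i] = min(
                prev[i - 1] + sub_cost(keyword[i - 1], syl),  # match/substitution
                prev[i] + 1.0,      # extra syllable in the recognizer output
                cur[i - 1] + 1.0,   # keyword syllable missing from the output
            )
        if cur[m] <= max_cost:
            hits.append((j, cur[m]))
        prev = cur
    return hits


if __name__ == "__main__":
    print(search_keyword(["zhong", "guo"], ["wo", "zai", "zong", "guo"], 0.5))
```

With `max_cost` set to 0 the search reduces to exact string matching, so the misrecognized "zong guo" in the example would be missed. Raising the threshold recovers such cases but also admits more false alarms, which is why the abstract pairs the search with a separate verification stage.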
Keywords/Search Tags: keyword recognition, confidence measure, utterance verification, error correction, support vector machine