The Performance Optimization Research On Large Vocabulary Continuous Speech Recognition

Posted on:2010-06-11

Degree:Master

Type:Thesis

Country:China

Candidate:J L Ou

Full Text:PDF

GTID:2178360275994203

Subject:Computer application technology

Abstract/Summary:

Large vocabulary continuous speech recognition (LVCSR) is one of the most important subjects in spoken language processing. It draws on many knowledge sources and techniques, including acoustic models, language models, and decoding algorithms. This paper introduces the basics of speech recognition and then discusses how to improve both the real-time performance and the recognition accuracy of speech recognition systems.

Most LVCSR systems are based on statistical models and use continuous density HMMs to model the acoustics of speech signals. In such a system, each state is modeled by a Gaussian mixture model (GMM) consisting of many Gaussian components. For likelihood-based recognizers of this kind, estimating the state likelihoods is computationally intensive and is one of the main reasons recognition is slow. Efficient techniques are therefore needed to reduce the cost of likelihood computation with no degradation, or no significant degradation, of recognition accuracy. An analysis of the likelihood computation in LVCSR systems based on continuous density HMMs shows that the conventional sequential computation is time-consuming and that the computation itself can be carried out in parallel. A SIMD-based algorithm for parallel likelihood computation is presented in this paper. The experimental platform uses the HTK 3.4 toolkit as the baseline system and the TIMIT and WSJ0 corpora as experimental data. On this platform, the algorithm is compared with other efficient techniques: partial distance elimination (PDE), best mixture prediction (BMP), feature component reordering (FCR), and Gaussian selection (GS). Experimental results show that the SIMD-based algorithm significantly reduces the time spent on likelihood computation without any degradation of recognition accuracy, and that it performs better than the other fast computation techniques.

To integrate semantic knowledge with the N-gram language model and thereby improve LVCSR accuracy, this paper describes the theory of latent semantic analysis (LSA) and the techniques for applying it in an LVCSR system. An LSA model is then constructed on the WSJ0 text corpus and combined with a conventional N-gram model by interpolation, forming a hybrid language model that includes semantic knowledge. To optimize the hybrid model, the k-means algorithm is applied to cluster vectors in the LSA vector space, with the centroids initialized using a density function, and a computation method for smoothing the probabilities is proposed. Model perplexity tests and continuous speech recognition experiments are conducted on the WSJ0 corpus. The results show that the hybrid language model outperforms the corresponding N-gram model and improves LVCSR recognition accuracy to some extent.
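The point made in the abstract, that the state likelihood can be computed in a data-parallel way, can be illustrated with a small sketch. This is not the thesis's SIMD algorithm or HTK code. It assumes diagonal-covariance Gaussians and uses NumPy array operations as a stand-in for SIMD instructions. The function names, the mixture count, and the feature dimension are all hypothetical.

```python
import math
import numpy as np


def gmm_loglik_sequential(obs, weights, means, variances):
    """Reference version: visit mixtures and feature dimensions one at a time."""
    per_mixture = []
    for c, mu, var in zip(weights, means, variances):
        log_p = math.log(c)
        for o_d, mu_d, var_d in zip(obs, mu, var):
            log_p -= 0.5 * (math.log(2.0 * math.pi * var_d)
                            + (o_d - mu_d) ** 2 / var_d)
        per_mixture.append(log_p)
    # Log-sum-exp over mixtures, shifted by the maximum for numerical stability.
    peak = max(per_mixture)
    return peak + math.log(sum(math.exp(p - peak) for p in per_mixture))


def gmm_loglik_vectorized(obs, weights, means, variances):
    """Data-parallel version: all mixtures and dimensions are evaluated together."""
    diff = obs - means  # broadcasts to shape (M, D)
    log_p = np.log(weights) - 0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + diff ** 2 / variances, axis=1
    )
    peak = np.max(log_p)
    return float(peak + np.log(np.sum(np.exp(log_p - peak))))


# Hypothetical sizes: 8 mixtures per state, 39-dimensional feature vectors.
rng = np.random.default_rng(0)
M, D = 8, 39
weights = rng.random(M)
weights /= weights.sum()
means = rng.normal(size=(M, D))
variances = rng.uniform(0.5, 2.0, size=(M, D))
obs = rng.normal(size=D)

seq = gmm_loglik_sequential(obs, weights, means, variances)
vec = gmm_loglik_vectorized(obs, weights, means, variances)
```

Both functions evaluate the same log-likelihood, so `seq` and `vec` agree up to floating-point rounding. Only the order of the arithmetic differs, which is consistent with the abstract's claim that parallel computation need not cost any recognition accuracy.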

Keywords/Search Tags:

Fast Likelihood Computation, Latent Semantic Analysis, Large Vocabulary Continuous Speech Recognition

Related items

1	Research And Development Of Continuous Speech Recognition Based On HTK And Microsoft Speech SDK
2	Research On Large Vocabulary Continuous Speech Recognition Based On Deep Learning
3	A Study Of An Irrelevant Variability Normalization Based Large Vocabulary Continuous Speech Recognition
4	Establishment Of Mandarin Large Vocabulary Continuous Speech Recognition Based On Hybrid ANN/HMM Models
5	Real-time speaker-independent large vocabulary continuous speech recognition
6	Application Of Convolutional Neural Network In Large Vocabulary Continuous Speech Recognition
7	Modeling lexical tones for Mandarin large vocabulary continuous speech recognition
8	Research And Construction Of Large Vocabulary Continuous Speech Recognition System
9	Large Vocabulary Continuous Speech Recognition Research Based On HTK
10	Research On The Matching Method Of Large Vocabulary Speech