
A Study Of Active Learning For Acoustic Modeling In Speech Recognition

Posted on: 2012-02-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Chen
GTID: 1118330371460291
Subject: Signal and Information Processing

Abstract/Summary:
Acoustic models for speech recognition are conventionally trained in a supervised way. With the rapid development of modern media and the rise of the Internet, it has become easy to collect large amounts of speech data, but these raw samples still require time-consuming and costly manual annotation. To train high-performance acoustic models with significantly reduced transcription effort, active learning (AL) is adopted: the most informative samples are actively selected for annotation and acoustic modeling. This has become a popular research topic. This dissertation focuses on active learning for acoustic modeling; its main contributions and innovations are as follows:

1. A KLD-based method for initial set selection
The initial set strongly influences the convergence rate of active learning, yet it is usually selected at random. We propose a method based on the Kullback-Leibler divergence (KLD): the distributions of the total set and of several candidate initial sets are first modeled by Gaussian mixture models (GMMs), the KLD between each candidate's distribution and that of the total set is then computed, and the candidate with the lowest KLD is selected as the initial set. Experiments show that the initial set selected in this way achieves a satisfactory convergence rate.

2. Sample evaluation algorithms based on different confidence measures
We first propose a sample evaluation algorithm based on multi-level confusion networks. Word posterior probabilities derived from word confusion networks have proven effective for active learning, but word structure in Chinese is flexible, so the word posterior probability may be inaccurate because of boundary ambiguity when the confusion network is generated. We therefore propose a unified framework that generates confusion networks at multiple levels, and the posterior probabilities obtained from each level are used to evaluate the unlabeled samples. We further propose a sample evaluation algorithm based on the combination of several predictors. Samples are usually selected with a single predictor such as the posterior probability, which cannot evaluate them comprehensively. In this algorithm, the predictors of the unlabeled samples are first built, the training set for a support vector machine (SVM) is then labeled with a recognition-result evaluation method based on mixed words, and the posterior probability output by the trained SVM is finally used for sample evaluation. Experiments show that both algorithms are effective.

3. A sample confidence measure based on latent topic similarity
Since recognition performance depends on the ability to resolve ambiguity and correct errors, confidence measures extracted from high-level information sources are important. We propose a sample confidence measure based on latent topic similarity: the topic distributions of each word and of its context in the recognition result are first obtained with a latent Dirichlet allocation (LDA) model, and the word confidence measure is then derived from the similarity between these two distributions. Experiments show that the proposed measure is complementary to confidence measures derived from decoding information.
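To make the latent-topic-similarity idea in contribution 3 concrete, the following is a minimal sketch, not the dissertation's implementation. It assumes a pre-trained LDA topic-word matrix `phi` (topics x vocabulary), approximates a word's topic distribution by normalizing its column of `phi`, forms the context distribution by averaging the topic distributions of the surrounding recognized words, and uses cosine similarity as the similarity function; the function names and the choice of cosine similarity are illustrative assumptions.

```python
import numpy as np

def word_topic_dist(word_id, phi):
    """Approximate p(topic | word) by normalizing the word's column of the
    LDA topic-word matrix phi (shape: n_topics x vocab_size)."""
    col = phi[:, word_id].astype(float)
    return col / (col.sum() + 1e-12)

def context_topic_dist(context_ids, phi):
    """Topic distribution of the recognized context: here simply the average
    of the topic distributions of the surrounding words."""
    return np.mean([word_topic_dist(i, phi) for i in context_ids], axis=0)

def topic_similarity_confidence(word_id, context_ids, phi):
    """Word confidence from latent topic similarity: cosine similarity
    (an assumed choice) between the word's and the context's topic
    distributions; low similarity suggests a likely recognition error."""
    w = word_topic_dist(word_id, phi)
    c = context_topic_dist(context_ids, phi)
    return float(np.dot(w, c) / (np.linalg.norm(w) * np.linalg.norm(c) + 1e-12))
```

In an active-learning loop, such a score would typically be combined with decoder-based confidence measures (for example confusion-network posteriors) when ranking unlabeled utterances.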
4. A selective training algorithm for acoustic modeling
Both confidence-based active learning and semi-supervised learning ignore the confidences of the individual words, characters, or syllables of the samples selected in each iteration. We propose a selective training algorithm that selects units (words, characters, or syllables) of the unlabeled samples according to a confidence measure, uses only those units for acoustic modeling, and applies this algorithm to semi-supervised learning. Initial experiments show that combining semi-supervised learning with selective training is effective when the selection ratio is small.
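The following is a minimal sketch of the selective-training idea, under assumptions not stated in the abstract: each decoded unit (word, character, or syllable) carries a confidence score, all units from the unlabeled pool are ranked, and only the top fraction given by the selection ratio is kept for the next acoustic-model training pass. The `RecognizedUnit` structure and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class RecognizedUnit:
    label: str          # hypothesized word, character, or syllable
    start_frame: int    # unit boundaries from the decoder alignment
    end_frame: int
    confidence: float   # confidence measure attached to this unit

def select_units(hypotheses: Dict[str, List[RecognizedUnit]],
                 selection_ratio: float = 0.1) -> Dict[str, List[RecognizedUnit]]:
    """Pool every hypothesized unit from the unlabeled utterances, rank them
    by confidence, and keep only the top `selection_ratio` fraction; the
    surviving units (and their frame spans) are the portions of the data
    used in the next semi-supervised acoustic-model training pass."""
    pooled: List[Tuple[str, RecognizedUnit]] = [
        (utt_id, unit) for utt_id, units in hypotheses.items() for unit in units
    ]
    pooled.sort(key=lambda item: item[1].confidence, reverse=True)
    n_keep = max(1, int(len(pooled) * selection_ratio))

    selected: Dict[str, List[RecognizedUnit]] = {}
    for utt_id, unit in pooled[:n_keep]:
        selected.setdefault(utt_id, []).append(unit)
    return selected
```

The small default ratio mirrors the abstract's observation that the combination with semi-supervised learning works best when only a small fraction of units is trusted.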
Keywords/Search Tags: active learning, acoustic model, confidence measure, sample evaluation