Adaptive Boosting for Automatic Speech Recognition

Posted on:2017-09-12

Degree:Ph.D

Type:Dissertation

University:Northeastern University

Candidate:Nguyen, Kham

GTID:1448390005469376

Subject:Electrical engineering

Abstract/Summary:

Phonemes are the atomic units of most automatic speech recognition (ASR) systems. However, the most widely used ASR features, perceptual linear prediction (PLP) and mel-frequency cepstral coefficients (MFCC), do not carry phoneme information explicitly. Discriminative features that encode phoneme information have been shown to be more effective for ASR accuracy. Generating such features relies on training classifiers that transform the original features into new probabilistic features.

One of the most commonly used techniques for modeling probabilities in continuous distributions is the Gaussian mixture model (GMM). In this work, a GMM-based classifier converts each acoustic feature vector into a vector of posterior probabilities over all classes. Furthermore, an adaptive boosting (AdaBoost) algorithm is applied to combine the classifiers and enhance performance.

Training GMM-based AdaBoost classifiers is computationally very expensive. To make it feasible for very large vocabulary speech recognition systems with thousands of hours of training data, we implemented a hierarchical AdaBoost that splits the whole training into multiple parallel processes. The speed-up reduced the training time from more than roughly 100 days to within a week.

The AdaBoost features were then successfully combined with spectral features for ASR. Compared to a baseline using the standard features, the AdaBoost system reduced the word error rate (WER) by 2%. Moreover, the AdaBoost system also contributed consistent gains in system combination, even against a very strong baseline.
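
As a rough illustration of the posterior-feature idea in the abstract, the sketch below turns one acoustic feature vector into a vector of class posteriors using one diagonal-covariance GMM per class, then merges the outputs of several classifiers with per-classifier weights. This is not the dissertation's implementation: the function names, the diagonal-covariance choice, the class priors, and the weighted-average combination rule are all assumptions made for the example.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    x: (D,), weights: (M,), means: (M, D), variances: (M, D).
    """
    diff = x - means  # broadcasts to (M, D)
    log_comp = -0.5 * (np.sum(np.log(2.0 * np.pi * variances), axis=1)
                       + np.sum(diff ** 2 / variances, axis=1))
    log_weighted = np.log(weights) + log_comp
    m = np.max(log_weighted)  # log-sum-exp for numerical stability
    return m + np.log(np.sum(np.exp(log_weighted - m)))

def class_posteriors(x, class_gmms, log_priors):
    """Posterior probability of each class given x, via Bayes' rule.

    class_gmms: list of (weights, means, variances) tuples, one per class.
    log_priors: (C,) log prior probability of each class (assumed known).
    """
    log_joint = np.array([gmm_log_likelihood(x, *g) for g in class_gmms])
    log_joint = log_joint + log_priors
    log_joint -= np.max(log_joint)
    post = np.exp(log_joint)
    return post / post.sum()

def combine_posteriors(posteriors, alphas):
    """Weighted average of posterior vectors from several classifiers.

    The weights play the role of AdaBoost classifier weights; a weighted
    average is an assumed combination rule chosen for illustration.
    """
    combined = np.asarray(alphas, dtype=float) @ np.asarray(posteriors, dtype=float)
    return combined / combined.sum()

# Toy usage: two classes, one-dimensional features, one Gaussian per class.
toy_gmms = [
    (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]])),  # class 0
    (np.array([1.0]), np.array([[4.0]]), np.array([[1.0]])),  # class 1
]
toy_log_priors = np.log(np.array([0.5, 0.5]))
print(class_posteriors(np.array([0.0]), toy_gmms, toy_log_priors))
```

In a real system the classes would be phonemes (or sub-phoneme states), the GMM parameters would be estimated from training data, and the combination weights would come from the boosting rounds rather than being supplied by hand.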

Keywords/Search Tags:

Speech, ASR, Features, AdaBoost, System, Used

Related items

1	Pornography Detection By AdaBoost Algorithm Based On The New Features
2	The Research Of Dimensional Speech Emotion Recognition Based On Neural Network And Fusion Features
3	Research Of Speech Emotion Recognition Based On AdaBoost And ELM
4	Statistical modeling of heterogeneous features for speech processing tasks
5	Exploring deep learning methods for discovering features in speech signals
6	A study of meta-linguistic features in spontaneous speech processing
7	Research On Statistical Parametric Speech Synthesis Integrating Speech Production Mechanisms
8	The Research Of Face Recognition Based On AdaBoost Algorithm And Fisher Criterion
9	Facial Features Localization Based On AdaBoost And Color Information
10	The Research Concerning The Features Of Mandarin Speech