Discriminative Training Based On TANDEM For Speech Assessment And Evaluation System

Posted on:2011-07-05

Degree:Master

Type:Thesis

Country:China

Candidate:S Gong

Full Text:PDF

GTID:2178360308955288

Subject:Signal and Information Processing

Abstract/Summary:

In recent years, speech assessment and evaluation systems, typified by computer-assisted language learning systems, have been increasingly used in oral exams and language learning activities. These systems not only help teachers score oral tests more objectively and efficiently, but also give students immediate and accurate evaluations of their pronunciation proficiency. Most current speech assessment and evaluation systems use maximum likelihood estimation (MLE) to estimate the parameters of models based on MFCC features. This popular statistical method has some disadvantages: when models are confusable or the training data is limited, it is unlikely to reach an optimal solution. To solve this problem, this thesis proposes discriminative training criteria and the TANDEM feature, which can improve the performance of current speech evaluation systems.

The thesis is organized as follows.

Chapter 1 gives a brief summary of the background and development of speech evaluation, then explains the basic principles and system structure of the speech scoring system and the speech error detection system respectively. Finally, it introduces some concepts from speech recognition technology that form the foundation of speech evaluation, such as acoustic features, the acoustic model and the language model.

Chapter 2 first gives an overview of Bayesian decision theory. To overcome the weakness of MLE, we bring discriminative training methods for hidden Markov models (HMMs) into the speech evaluation system. Four typical discriminative training criteria and some methods for updating acoustic model parameters are introduced and then defined in a unified framework. Finally, we analyze the relationship between the target of the speech evaluation task and the objective function of each discriminative training criterion. This thesis proposes that the strategy for choosing the discriminative function must be consistent with the measure of pronunciation evaluation.

Chapter 3 first compares the HMM/ANN framework with the HMM/GMM framework. HMM/ANN has an advantage over HMM/GMM in discriminative training ability. However, incremental enhancements such as speaker adaptation and discriminative parameter estimation are not easily implemented in it. In this work, we apply the TANDEM approach, which combines neural-net discriminative feature processing with Gaussian-mixture distribution modeling, to a Mandarin speech error detection system. An MLP network is trained to estimate the probability distributions, and the error detection system based on the HMM/GMM framework then uses transformations of these estimates as its input features. The experimental results in this chapter show a large improvement in error-detection performance, especially when maximum likelihood linear regression (MLLR) adaptation is used.

Chapter 4 analyzes the opportunity for combining the TANDEM feature with discriminative training, then introduces the system structure, scoring features and performance measures of the English speech scoring system. Finally, we design and build four systems, namely MFCC-MLE, TANDEM-MLE, MFCC-MPE and TANDEM-MPE, and test them on the Child and Middle data sets. The experimental results show that discriminative training based on TANDEM achieves the best evaluation performance and significantly outperforms MLE based on MFCC.

Chapter 5 concludes the thesis and discusses possible improvements.
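The TANDEM step described for Chapter 3 can be illustrated with a minimal sketch. The abstract does not say which transformations the thesis applies to the MLP outputs; the version below assumes a choice common in the TANDEM literature, a logarithm followed by PCA decorrelation. The function name `tandem_features` and its parameters are hypothetical, and the PCA basis is estimated on the same frames it transforms purely for brevity, whereas a real system would estimate it on training data.

```python
import numpy as np

def tandem_features(posteriors, n_components, eps=1e-10):
    """Map per-frame MLP class posteriors to TANDEM-style features.

    posteriors: array of shape (n_frames, n_classes), each row sums to 1.
    Returns an array of shape (n_frames, n_components).
    Assumption: log + PCA is used as the transformation (not stated in the abstract).
    """
    # The log reduces the heavy skew of posterior values.
    log_post = np.log(posteriors + eps)
    # Centre the data, then project onto the leading eigenvectors of the
    # covariance matrix so the output dimensions are decorrelated.
    centered = log_post - log_post.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order]
```

Decorrelated features suit the diagonal-covariance Gaussian mixtures typically used in HMM/GMM systems. The resulting vectors would replace, or be appended to, the MFCC features before conventional HMM/GMM training.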

Keywords/Search Tags:

Speech Evaluation System, Speech Error Detection, Speech Scoring, Discriminative Training, Minimum Phone Error, TANDEM, Multi-Layer Perceptron

Related items

1	Discriminative Methodologies For Tone Problem Solving In Mandarin Speech Recognition
2	Research On Discriminative Training In Speech Recognition
3	Research On Simultaneous Speech Detection And Magnitude Squared Spectrum Estimation Approach For Speech Enhancement
4	Speech Enhancement In High Noise Environment
5	Research On Speech Emotion Recognition Based On Attributes Evaluation And Multi-layer Perceptron
6	The Research On Segmentation Acoustic Model Based On MPE Tibetan Lhasa Dialect
7	Integration of multiple knowledge sources in speech recognition using minimum error training
8	A Research On Speech Synthesis Based On Statistical Modeling And Pronunciation Error Detection
9	Post-Processing Technique For Speech Recognition
10	Research On Decoding Technology Of Chinese Continuous Speech Recognition