Acoustic Modeling For Continuous Speech Recognition

Posted on:2003-09-17

Degree:Master

Type:Thesis

Country:China

Candidate:L Xie

Full Text:PDF

GTID:2168360092966185

Subject:Computer application technology

Abstract/Summary:

Speech recognition has developed rapidly in recent years. Enabling computers to understand human speech, and even to converse with people, is a long-standing goal that may be realized in the near future. Current speech recognition research focuses mainly on Continuous Speech Recognition (CSR). This thesis is part of a bilateral project between China and Belgium named "Machine Vision and Speech Technology of Augmented Reality". The first target of the project is to generate and drive a Talking Head from speech recognition results, which requires a CSR recognizer. Building the acoustic models of this recognizer is the core of the thesis.

First, a brief introduction is given to the theory of the Hidden Markov Model (HMM), which is widely used in CSR. Three important problems must be solved before HMMs can be applied to practical speech recognition, and these problems are analyzed in detail. This part ends by introducing the Continuous Density HMM and the types of HMM.

Next, the overall structure of our recognizer is described, including acoustic analysis of speech, acoustic modeling, and the recognition strategy. The detailed analysis concentrates on building the acoustic HMMs, which covers the choice of the basic acoustic unit, the English phoneme set, and the method for training the HMM parameters. The embedded training algorithm is highlighted in this part.

Because the basic HMMs gave poor recognition performance, I then modified and refined them. First, I introduce a kind of context-dependent HMM, the triphone, which takes into account the neighboring context of each phoneme. These context-dependent models are then refined by techniques such as mixture incrementing and parameter tying. My research focuses mainly on state tying of triphone models. A decision-tree based top-down strategy is presented, which uses acoustic-linguistic questions to classify model states.

The last part of the thesis presents several recognition experiments based on the research above. The results show that context-dependent models improve recognizer performance to a large extent, and that decision-tree based parameter tying can balance the amount of training data available in the speech corpus against model accuracy. Future work is discussed at the end of the thesis.

Most of the work in this thesis was done in the ETRO department of the Vrije Universiteit Brussel (VUB), Belgium.

Keywords/Search Tags:

Continuous Speech Recognition (CSR), Acoustic Model, Hidden Markov Model (HMM), Embedded Training Algorithm, Context Dependent, Decision-Tree based State Tying

Related items

1	Research And Application Of Search Algorithm For Continuous Speech Recognition
2	Study On HMM-Based Chinese Speech Synthesis
3	Researching Of The Mongolian Acoustic Model Based On Speech Recognition
4	Study And Improve On The Mongolian Speech Recognition System
5	Research On Discriminative Techniques Of Feature Extraction And Acoustic Model Training In Continuous Speech Recognition
6	Research On Chinese Continuous Speech Recognition In Noisy Environment
7	Research Of Speech Recognition Based On Mixture Feature Extraction And Improved Continuous Hidden Markov Model
8	Speaker Recognition Based On Continuous Hidden Markov Model
9	Research And System Realization Of Tibetan Continuous Speech Recognition Technology
10	The Study Of Feature Extraction And Acoustic Modeling In Speech Recognition System