
Research Of Acoustic Modeling For Speech Recognition System

Posted on: 2008-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: D Peng
Full Text: PDF
GTID: 2178360215982483
Subject: Signal and Information Processing
Abstract/Summary:
Acoustic modeling is one of the key problems in speech recognition. This thesis studies acoustic modeling techniques and parameter tying strategies, with two main focuses. First, the basic context-dependent acoustic modeling method is investigated and revised in terms of basic acoustic unit selection and question set refinement. Second, several open problems in acoustic modeling are discussed, including sparse training data, optimal model selection, and pronunciation variation. The work is detailed as follows:

1. The HMM Tool Kit (HTK) platform is studied and analyzed. Based on HTK, an effective method is implemented for acoustic model training and performance evaluation. The Decision Tree (DT) based state tying strategy in Context Dependent (CD) acoustic modeling is studied in depth. Two different DT design methods are analyzed, and the design of the question set and the DT node splitting strategy are discussed. Experiments are carried out on the CD Initial/Final (IF) model with decision tree based state tying.
a) To maintain the connection between Initial and Final, the Extended IF (XIF) set is proposed by adding the Zero Initials to the standard IF set. Experiments show that the XIF model outperforms the IF model.
b) The question set design is refined based on linguistic knowledge, and the stopping criterion of the decision tree is also investigated and revised. Together, these two changes yield a total increase of 4% in phone accuracy.

2. To minimize the recognition errors caused by inaccurate model estimates for toned triphones with limited training samples, we propose initializing each toned triphone with the parameters of its corresponding toneless triphone model. In addition, mixture component adaptation is explored to obtain better performance while reducing model scale.

3. The SCSPM is implemented and evaluated. It builds on the traditional Hidden Markov Model (HMM) and a modified HMM, the Mixed Gaussian Continuous Probability Model (MGCPM). It integrates the Vector Quantization (VQ) technique with continuous probability density distributions, and adopts the Tied Mixture method to describe the probability distribution of each state. Compared with the MGCPM, the SCSPM significantly reduces model scale and computational complexity with little degradation in recognition accuracy.

Moreover, research on pronunciation variation is conducted, based on the idea that pronunciation variations are hidden in recognition errors and can be located and modeled. Our work covers locating the pronunciation variations and collecting the related information. Our preliminary experimental results show that the approach is effective for Chinese continuous speech recognition.
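The toned-triphone initialization described in item 2 of the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used in the thesis: the label format (HTK-style `left-centre+right` names with a trailing tone digit 1-5, such as `b-a1+n`), the dictionary representation of model parameters, and the function names `strip_tone` and `init_toned_from_toneless` are all hypothetical.

```python
import copy
import re

# Assumption: HTK-style "left-centre+right" labels in which a digit 1-5
# directly after a unit name marks its tone, e.g. "b-a1+n".
_TONE = re.compile(r"(?<=[a-z])[1-5](?=$|[+\-])")


def strip_tone(name: str) -> str:
    """Map a toned triphone label to its toneless counterpart."""
    return _TONE.sub("", name)


def init_toned_from_toneless(toned_names, toneless_models):
    """Seed each toned triphone with a copy of its toneless model.

    toneless_models maps a toneless label to its parameters (here a plain
    dict of means and variances, a hypothetical stand-in for an HMM).
    Returns the initialized toned models and the labels with no
    toneless counterpart.
    """
    toned_models = {}
    missing = []
    for name in toned_names:
        base = strip_tone(name)
        if base in toneless_models:
            # Deep copy so later re-estimation of one toned model does
            # not alter the toneless model or its sibling tones.
            toned_models[name] = copy.deepcopy(toneless_models[base])
        else:
            missing.append(name)
    return toned_models, missing


toneless = {"b-a+n": {"means": [[0.0, 1.0]], "vars": [[1.0, 1.0]]}}
toned, missing = init_toned_from_toneless(
    ["b-a1+n", "b-a4+n", "d-e2+ng"], toneless
)
```

Starting from the toneless parameters gives every toned triphone a reasonable initial estimate, which subsequent re-estimation can refine where enough toned training samples exist.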
Keywords/Search Tags: speech recognition, acoustic modeling, triphone, decision tree