
A Research On Key Technology Of Computer Assisted Putonghua Pronunciation Assessment

Posted on: 2011-04-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q S Liu
Full Text: PDF
GTID: 1118360305466584
Subject: Signal and Information Processing
Abstract/Summary:
Speech is the most convenient means of human communication. As society develops, Computer-Assisted Language Learning (CALL) is attracting more and more attention. Pronunciation assessment is one of the key technologies needed to build a CALL system. It lets learners understand the state of their pronunciation and their pronunciation ability, so that they can direct their study and practice more specifically. This work studies several key technologies of pronunciation assessment based on statistical speech recognition: the core pronunciation assessment algorithm, an adaptation method for the pronunciation assessment acoustic model, the application of duration and rate of speech to pronunciation assessment, and the score mapping model. Experimental results show that the system studied here reaches a practical level for Putonghua pronunciation assessment. The research and results are summarized as follows.
First, after introducing the main components of a pronunciation assessment system, the work analyzes in depth its core algorithm, the log posterior probability algorithm. It then improves the algorithm in several respects: simplifying the log posterior probability formula by refining the recognition network according to phonetic knowledge; refining the recognition network according to pronunciation error patterns, which are derived from the distances between models as measured by the Kullback-Leibler divergence (KLD); and improving the normalization of the log posterior probability with phoneme weighting factors based on the key and difficult phones of the National Proficiency Test of Putonghua.
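As a rough illustration of the log posterior probability measure and the phoneme weighting described above, the sketch below scores one phone segment and then combines per-phone scores. It is a generic textbook-style formulation, not the formula from this work: the function names, the uniform phone priors, the per-frame normalization, and the example weights are all assumptions made for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_posterior_score(segment_loglik, target, n_frames, competitors=None):
    """Frame-normalized log posterior of `target` for one phone segment.

    segment_loglik maps each phone label to log p(O | phone) for the segment.
    competitors restricts the denominator, standing in for a reduced
    recognition network; by default every phone in segment_loglik competes.
    Uniform phone priors are assumed (hypothetical simplification).
    """
    pool = set(competitors) if competitors is not None else set(segment_loglik)
    pool.add(target)  # the target phone always appears in the denominator
    denominator = log_sum_exp([segment_loglik[p] for p in pool])
    return (segment_loglik[target] - denominator) / n_frames


def weighted_utterance_score(phone_scores, weights, default_weight=1.0):
    """Weighted mean of (phone, score) pairs; unlisted phones get default_weight."""
    total = sum(weights.get(p, default_weight) * s for p, s in phone_scores)
    norm = sum(weights.get(p, default_weight) for p, _ in phone_scores)
    return total / norm


if __name__ == "__main__":
    # Made-up segment log-likelihoods for a target "zh" confusable with "z".
    loglik = {"zh": -100.0, "z": -100.0, "a": -200.0}
    print(log_posterior_score(loglik, "zh", n_frames=10))
    print(log_posterior_score(loglik, "zh", n_frames=10, competitors=["z"]))
    # Emphasize a phone treated as "difficult" with a larger (made-up) weight.
    print(weighted_utterance_score([("zh", -0.2), ("a", -0.1)], {"zh": 3.0}))
```

Restricting `competitors` to phones that are plausible confusions shortens the denominator, which is one way to read the recognition-network refinements mentioned in the abstract; the actual network construction and weighting scheme would have to come from the full text.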
Together, these changes to the log posterior probability algorithm improve performance over the baseline Putonghua pronunciation assessment system.
Second, starting from the mismatch between the training environment and the application environment of the acoustic model used in pronunciation assessment, the work carefully analyzes how the requirements on the acoustic model differ between pronunciation assessment and speech recognition. Although the two tasks have much in common, their purposes differ. Speech recognition deliberately blurs the differences in how different people pronounce a word so that all variants are identified as the same word, whereas pronunciation assessment must measure how far each of these widely varying pronunciations departs from the standard pronunciation. Building on model adaptation techniques from speech recognition, the work proposes a selective adaptation strategy, which chooses the relatively standard portion of a speaker's pronunciation data to adapt the acoustic model of the assessment system. It also analyzes how the amount and granularity of the selected data affect the adaptation.
Third, the work studies the application of duration and rate of speech to pronunciation assessment. It reviews earlier research on rate of speech and its use in text-to-speech and pronunciation assessment, focusing on the ANGIE duration model. Based on the ANGIE duration model, it implements the duration normalization and relative rate-of-speech computation used for Putonghua pronunciation assessment.
The work also introduces methods for computing duration scores from an absolute rate-of-speech model and a relative rate-of-speech model, and evaluates the performance of these scores experimentally.
Finally, to build a practical Putonghua pronunciation assessment system, the work studies the score mapping model, which converts the pronunciation assessment measures into a machine-predicted score. After introducing the general model based on multivariate linear regression and analyzing its inadequacies in practical use, the work proposes a piecewise linear regression model to improve the mapping. It then presents piecewise linear regression models based on confidence-interval classification, Gaussian mixture model (GMM) probability weighting, and support vector machine (SVM) classification, and validates the performance of these mapping models.
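The idea of a piecewise linear score mapping can be sketched as follows. This is a deliberately reduced version, assuming a single machine measure and a fixed threshold that assigns each sample to one of two segments; the work itself describes multivariate regression with segments chosen by confidence intervals, GMM weights, or an SVM, none of which is reproduced here. All names and the example data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_piecewise(xs, ys, threshold):
    """Fit separate lines below and at-or-above a fixed threshold.

    Each side needs at least two samples with distinct x values.
    """
    low = [(x, y) for x, y in zip(xs, ys) if x < threshold]
    high = [(x, y) for x, y in zip(xs, ys) if x >= threshold]
    return {
        "threshold": threshold,
        "low": fit_line(*zip(*low)),
        "high": fit_line(*zip(*high)),
    }


def predict(model, x):
    """Map a machine measure to a predicted score with the matching segment."""
    key = "low" if x < model["threshold"] else "high"
    slope, intercept = model[key]
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up machine measures and human scores with a change of slope.
    measures = [-4.0, -3.0, -2.0, -1.0, -0.5, 0.0]
    human = [20.0, 40.0, 60.0, 85.0, 90.0, 95.0]
    model = fit_piecewise(measures, human, threshold=-1.5)
    print(predict(model, -2.5), predict(model, -0.25))
```

A single global line would have to compromise between the steep low-score region and the flatter high-score region in this toy data, which is the kind of inadequacy a piecewise model is meant to address.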
Keywords/Search Tags: Computer-Assisted Language Learning (CALL), speech recognition, pronunciation assessment, model adaptation, multivariate linear regression