
Research On Automatic Evaluation Methods Of Mandarin Pronunciation

Posted on: 2014-11-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R Zhang
Full Text: PDF
GTID: 1268330392972527
Subject: Computer application technology
Abstract/Summary:
China's steady economic growth has attracted more and more foreigners to learn Mandarin, but restricted learning environments and a shortage of professional teaching staff have hindered the international dissemination of Mandarin. Developing computer-assisted systems for teaching Mandarin as a foreign language is therefore a top priority. Automatic pronunciation evaluation is one of the key technologies of such systems, and its results are the basis of the feedback that improves learning efficiency. Mandarin is a monosyllabic tonal language whose phonetics differ significantly from those of the Indo-European languages: a Mandarin syllable can be analyzed into three parts, the consonant, the vowel, and the tone, and the tone is a linguistically distinctive feature that distinguishes meaning. Automatic evaluation of Mandarin must therefore account not only for the variability and redundancy of the speech signal, but also for the practical needs of teaching Mandarin to foreign learners and for the characteristics of their pronunciation. Moreover, the three-part structure of Mandarin syllables makes learners' pronunciation errors so diverse and complex that a single acoustic feature can hardly achieve the desired results when detecting multiple kinds of pronunciation errors, so additional auxiliary acoustic features should be selected for automatic pronunciation evaluation. In short, research on automatic evaluation methods for Mandarin pronunciation is important and valuable for the international promotion of Mandarin.

This dissertation studies automatic pronunciation evaluation methods for foreign learners of Mandarin. To reduce the impact of language transfer and to help learners establish new sounds, features and models for automatic pronunciation evaluation are constructed. The test speech is evaluated from four aspects: consonant, vowel, tone, and integration.
The main contents of this dissertation are as follows:

(1) An automatic pronunciation evaluation method based on the perception of phoneme models is studied, which helps to handle cases where a Mandarin sound is replaced by a similar sound from the learner's mother tongue. The method reduces speech discrimination errors by increasing the distance between the probability distributions of standard and non-standard pronunciations, following the principle of minimizing the Bayes error. First, a perception matrix is constructed by computing the distances between phoneme models of a standard Mandarin corpus and a non-standard Mandarin corpus. Second, phoneme models are added to the alternative set of the recognition network in descending order according to the perception matrix, which is computed on a different corpus. Last, the alternative set of the recognition network helps to reduce the recognition search space and to improve the accuracy of speech sample evaluation. The experimental results show that using the perception of phoneme models to choose candidate models for similar mother-tongue pronunciations not only reduces the computational complexity of the speech recognition network, but also increases the correlation between automatic scoring and expert scoring and improves the accuracy of pronunciation error detection.

(2) Two stable tone features are studied to address the variability of tone features in continuous speech, which helps with the problem in tone teaching that tone category features are affected by intonation. The first tone representation applies a piecewise linear fitting algorithm to the fundamental frequency curve within each syllable and uses the coefficients of the fitted lines to represent the tone. This representation helps to reduce the impact of context in continuous speech. Hypothesis testing is used to improve the accuracy of the tone evaluation results.
The second tone representation is based on separating prosodic components from the fundamental frequency contour. The Fujisaki model is used to separate the different prosodic components of the speech fundamental frequency curve. First, linguistic knowledge is used to determine the prosodic boundaries needed to compute the Fujisaki components. Next, the tone features are obtained while preserving the intonation. Last, a linguistic classification and regression tree is constructed to classify the tones in continuous speech. The experimental results show that the former method has a unique advantage in detecting rising-tone errors. In the latter method, linguistic knowledge improves the accuracy of the Fujisaki fitting results, and combining the Fujisaki model with the linguistic classification and regression tree effectively improves the accuracy of tone evaluation; its performance is superior to that of the traditional pronunciation evaluation method based on fundamental frequency curves.

(3) An automatic pronunciation evaluation method based on formant structure is studied, which helps to detect pronunciation errors caused by fixed articulators. The method defines a formant structure feature to reduce the formant distortion caused by the vocal tract length normalization algorithm and by environmental noise. The distortion between two formant structures is computed and used for automatic pronunciation evaluation. The experimental results show that both the vowel error detection rate and the overall performance of automatic pronunciation evaluation are superior to those of the evaluation method based on formant features.

(4) A hierarchical, fuzzy-logic-based method for integrating objective evaluation results is studied.
First, the method constructs a hierarchical model for integrating objective evaluation results and introduces pronunciation error analysis as its middle layer, which changes the direct mapping between objective pronunciation results and expert scores into an indirect mapping. Next, a fuzzy function is used to analyze the integration results at the different levels of the hierarchy. The experimental results show that the middle layer of pronunciation errors not only provides an analysis of the pronunciation errors, but also improves the accuracy of automatic pronunciation scoring by introducing knowledge of the correlation between pronunciation errors and expert scores. Using fuzzy logic to simulate the expert's subjective evaluation process improves the accuracy of the integration of objective evaluation results.
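As an illustration of the first tone representation in item (2), the following is a minimal Python sketch of piecewise linear fitting over a syllable's fundamental frequency (F0) contour. It is not the dissertation's implementation. The abstract does not state the number of segments, how segment boundaries are chosen, or how time and F0 are scaled, so the three equal-length segments, the time axis normalised to [0, 1], and the function names `fit_line` and `piecewise_tone_features` are all assumptions made for this example.

```python
from typing import List, Tuple


def fit_line(ts: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0.0:
        # All time stamps identical: fall back to a flat line at the mean.
        return 0.0, mean_y
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = cov / var_t
    return slope, mean_y - slope * mean_t


def piecewise_tone_features(
    f0: List[float], n_segments: int = 3
) -> List[Tuple[float, float]]:
    """Fit one line per segment of a voiced F0 contour (hypothetical helper).

    Assumptions for this sketch: f0 holds the voiced F0 samples of one
    syllable, segments have (nearly) equal length, and time is normalised
    to [0, 1] over the syllable so slopes are comparable across durations.
    Returns a list of (slope, intercept) pairs, one per segment.
    """
    if n_segments < 1 or len(f0) < 2 * n_segments:
        raise ValueError("need at least two F0 samples per segment")
    n = len(f0)
    times = [i / (n - 1) for i in range(n)]
    features = []
    for k in range(n_segments):
        start = k * n // n_segments
        end = (k + 1) * n // n_segments
        features.append(fit_line(times[start:end], f0[start:end]))
    return features
```

Under these assumptions, a steadily rising contour yields a positive slope in every segment and a level contour yields slopes of zero, so the slope signs give a coarse picture of the tone shape. A classifier or a hypothesis test, as mentioned in the abstract, would operate on these coefficients.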
Keywords/Search Tags: automatic pronunciation evaluation, perception of model, fundamental frequency, tone recognition, formant