End To End Mispronunciation Detection And Diagnosis

Posted on:2021-08-09

Degree:Master

Type:Thesis

Country:China

Candidate:Y Q Feng

Full Text:PDF

GTID:2518306569494804

Subject:Computer Science and Technology

Abstract/Summary:


With the growing demand for language learning, Computer-Aided Pronunciation Training (CAPT) systems are expected to deliver higher performance. One of the key technologies of a CAPT system is mispronunciation detection and diagnosis (MD&D). Compared with traditional teachers, CAPT systems have the advantages of low cost and high flexibility, and they are favored by more and more L2 language learners. MD&D can be treated as a special type of automatic phone recognition: mispronunciations are detected and diagnosed wherever the recognized phones differ from the canonical productions (obtained from the text prompts presented to the speakers).

For the task of mispronunciation detection, this thesis first explores the effect of traditional unsupervised mispronunciation detection methods. To verify the necessity of designing and training models for MD&D, it compares the results of different goodness of pronunciation (GOP) algorithms within these unsupervised methods.

For the task of mispronunciation detection and diagnosis, this thesis first constructs a data processing pipeline that includes audio feature extraction, phoneme information normalization, and a data augmentation strategy. It then designs several single-modal phoneme sequence labeling models with different structures. Comparative experiments verify the effectiveness of the data augmentation strategy and identify the best single-modal phoneme sequence labeling model structure.

Because the text is known before mispronunciation detection and diagnosis is performed, a multimodal phoneme sequence labeling model is constructed. Through an attention mechanism, this model aligns the audio information at each position with the text information to achieve better phone classification at each position. Experimental results show that the proposed multimodal phone sequence labeling model improves significantly on all metrics compared with the single-modal phone sequence labeling model.

Because the dataset was recorded by speakers from different countries with different first languages, this thesis also explores improving the model by integrating first language information into the multimodal phone sequence labeling model. To this end, it explores the construction of multi-task models and multi-input models. The experimental results show that first language information can improve the phone sequence labeling model to a certain extent, and that the multi-input model is the best way to integrate first language information.

The best model designed in this thesis is the first mispronunciation detection and diagnosis model that integrates both text and first language information. The experiments show that, among all the proposed model variations and the existing models compared in the experiments, it reaches the best performance on the open dataset L2-ARCTIC.
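The idea that mispronunciations surface where recognized phones differ from the canonical ones can be illustrated with a small sketch. The abstract does not say how the two sequences are aligned, so the Levenshtein (edit-distance) alignment below is an assumption, and the function name `align_phones`, the label names, and the ARPAbet-style phone symbols are hypothetical choices made for illustration, not the thesis's implementation.

```python
from typing import List, Tuple


def align_phones(canonical: List[str], recognized: List[str]) -> List[Tuple[str, str, str]]:
    """Align two phone sequences and label each aligned position.

    Returns (label, canonical_phone, recognized_phone) tuples, where label is
    "correct", "substitution", "deletion" or "insertion". This is a plain
    edit-distance alignment, assumed here for illustration only.
    """
    n, m = len(canonical), len(recognized)
    # dp[i][j] = edit distance between canonical[:i] and recognized[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if canonical[i - 1] == recognized[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j - 1] + cost,  # match or substitution
                dp[i - 1][j] + 1,         # canonical phone was dropped
                dp[i][j - 1] + 1,         # extra phone was produced
            )

    # Walk back from the bottom-right corner to recover the alignment.
    ops: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if canonical[i - 1] == recognized[j - 1] else 1
            if dp[i][j] == dp[i - 1][j - 1] + cost:
                label = "correct" if cost == 0 else "substitution"
                ops.append((label, canonical[i - 1], recognized[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("deletion", canonical[i - 1], ""))
            i -= 1
        else:
            ops.append(("insertion", "", recognized[j - 1]))
            j -= 1
    ops.reverse()
    return ops


if __name__ == "__main__":
    # Hypothetical example: "think" pronounced with /S/ instead of /TH/.
    for label, canon, recog in align_phones(["TH", "IH", "NG", "K"], ["S", "IH", "NG", "K"]):
        print(f"{label:12s} canonical={canon or '-':3s} recognized={recog or '-'}")
```

In this sketch, detection corresponds to any label other than "correct", and diagnosis corresponds to the label together with the recognized phone. A sequence labeling model of the kind described in the abstract would supply the `recognized` sequence.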

Keywords/Search Tags:

computer-aided pronunciation training system, mispronunciation detection and diagnosis, end-to-end model, multimodal fusion, multi-task learning


Related items

1	Aided Diagnosis Based On Multimodal Medical Image Fusion And Machine Learning
2	The Visual English Mandarin Computer-Assisted Pronunciation Training System
3	Study On The Key Techniques Of Computer-aided Diagnosis For Lung Cancer In Medical Images
4	Research On Humor Recognition Based On Multimodal Fusion
5	The Design And Implementation Of Computer-aided Medical Diagnosis Platform Based On 3D Reconstruction Technology
6	Computer-aided Diagnosis System Of Skin Diseases Based On Deep Learning
7	Study Of Computer-aided Detection Methods Based On Mammographic Images
8	Research Of Computer-aided Diagnosis Of Digestive Endoscopic Image
9	Research Of Computer-Aided Diagnosis Of Digestive Endoscopic Image
10	Research On Key Techniques In Multi-phase CT Image Based Computer-aided Hepatic Lesion Diagnosis System