
Research On Semi-supervised Mongolian-Chinese Neural Machine Translation Based On Cooperative Training

Posted on: 2022-05-18, Degree: Master, Type: Thesis
Country: China, Candidate: L X Wen, Full Text: PDF
GTID: 2518306542978159, Subject: Computer application technology
Abstract/Summary:
As society develops, communication among people becomes ever more frequent, and a means of communication that crosses regional boundaries and breaks down language barriers is crucial. In the era of big data, however, traditional manual translation struggles to cope with massive volumes of language data. Machine translation meets this need, but it depends on large-scale, high-quality parallel corpora. For minority languages such as Mongolian, high-quality Mongolian-Chinese parallel corpora are very difficult to collect, because the language is used in a limited area whose economic and cultural development has been relatively slow. How to use the existing corpora effectively, reasonably, and fully to improve the quality of Mongolian-Chinese translation models has therefore become an important research topic. Against this background, this thesis studies semi-supervised Mongolian-Chinese neural machine translation based on cooperative training from the following three aspects.

First, all existing corpora are preprocessed, and the division of the data set, the choice of corpus scale, corpus segmentation, and BERT-based model pre-training are studied. Each of these is verified experimentally, laying a foundation for the subsequent work.

Second, to address the poor translation quality and weak generalization of the model, a semi-supervised Conditional Sequence Generative Adversarial Network is used to train the translation model. To improve model quality, the choice of neural network for the generator and for the discriminator is studied experimentally, a policy gradient algorithm with a BLEU-based Q function is applied in the adversarial training of the generator and the discriminator, and comparative experiments are carried out.

Third, to address the poor model quality caused by the sparsity of Mongolian-Chinese parallel corpora, cooperative training is introduced into the training of the translation model based on Conditional Sequence Generative Adversarial Networks. In this method, a readily available, high-quality English-Chinese translation model is trained cooperatively with the Mongolian-Chinese translation model, and a perplexity-based evaluation method is used to select the better Chinese translation generated by the generator. In addition, the English input to the English-Chinese translation model is obtained by translating from Mongolian, which requires acquiring a Mongolian-English parallel corpus and building a Mongolian-English translation model.

Finally, comparative experiments were carried out on all of the research items above, and experimental conclusions were drawn. Throughout, the quality of the Mongolian-Chinese translation models is evaluated with BLEU. Comparative analysis of the experimental data shows that semi-supervised Mongolian-Chinese neural machine translation based on cooperative training has certain advantages in improving the quality of the Mongolian-Chinese translation model, which is of some significance for the development of the Mongolian language, for cultural exchange, and even for economic and trade development.
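
The abstract mentions adversarial training driven by a policy (strategy) gradient algorithm with a BLEU-based Q function. As a rough illustration only, the sketch below shows one common way such a reward can enter a REINFORCE-style loss. Everything here is an assumption rather than the thesis's implementation: the function names `sentence_bleu` and `policy_gradient_loss` are hypothetical, the BLEU variant is a simplified add-one smoothed, single-reference version, and the abstract does not say whether BLEU is combined with the discriminator's score, so the sketch uses BLEU alone.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU in [0, 1].

    Assumption: add-one smoothing on every n-gram order and a single
    reference. This is an illustrative variant, not a standard scorer.
    """
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        log_precision_sum += math.log((overlap + 1) / (total + 1))
    # Brevity penalty for candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


def policy_gradient_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE-style surrogate loss for one sampled translation.

    token_log_probs: log-probabilities the generator assigned to each
    sampled token (hypothetical input; a real model would supply these).
    Minimizing this loss raises the probability of samples whose reward
    exceeds the baseline.
    """
    return -(reward - baseline) * sum(token_log_probs)


reference = ["the", "cat", "sat", "on", "the", "mat"]
sampled = ["the", "cat", "sat", "on", "a", "mat"]
reward = sentence_bleu(sampled, reference)
loss = policy_gradient_loss([-0.2, -0.1, -0.3, -0.2, -1.5, -0.4], reward)
```

In a real training loop the log-probabilities would come from the generator network and the loss would be backpropagated through it; the plain-Python version above only shows how the reward scales the update.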
Keywords/Search Tags:Mongolian-Chinese Machine Translation, Adversarial Training, Collaborative Training, Semi-supervised Method, Conditional Sequence Generative Adversarial Networks