
Research On Unsupervised Neural Machine Translation

Posted on: 2021-09-09    Degree: Doctor    Type: Dissertation
Country: China    Candidate: H P Sun    Full Text: PDF
GTID: 1488306569984209    Subject: Computer application technology
Abstract/Summary:
The training of neural machine translation (NMT) often relies on large-scale parallel corpora, but not all language pairs have sufficient parallel data. To alleviate this scarcity, unsupervised NMT (UNMT) models translation relying solely on monolingual corpora, with the help of a combination of mechanisms such as unsupervised pre-training, denoising auto-encoding, back-translation, and shared latent representations. This thesis studies UNMT in the following four aspects.

1. Pseudo-data-based unsupervised neural machine translation and analysis of distant language pairs. Although UNMT has achieved remarkable results on some similar language pairs, it often performs poorly on distant language pairs such as Chinese-English and Japanese-English. We therefore first empirically analyze this poor performance in terms of unsupervised bilingual word embedding (UBWE) quality, shared words, and word order. We propose an artificial shared-word replacement strategy and a pre-ordering strategy that increase the number of shared words between distant language pairs and reduce their syntactic differences, thereby improving UNMT performance on such pairs. The denoising auto-encoder and the shared latent representation mechanism in conventional UNMT are necessary only to bootstrap training in the early stage: learning a shared latent representation restricts translation performance in both directions, particularly for distant language pairs, while denoising dramatically delays convergence by continuously modifying the training data. To avoid these problems, we propose pseudo-data-based UNMT, which trains two standard NMT systems on the pseudo-parallel data generated by UNMT. Experiments show that the proposed method significantly outperforms conventional UNMT in translation quality while training faster.

2. Unsupervised bilingual word embedding agreement for unsupervised neural machine translation. Although the quality of the pre-trained UBWE is positively correlated with UNMT performance, UBWE is used only to initialize UNMT, and its quality degrades significantly during UNMT training. We propose two UBWE agreement methods for training the UNMT model: UBWE agreement regularization, which regularizes how the word embeddings change during back-translation training, and UBWE adversarial training, which is trained jointly with UNMT and therefore interacts more closely with the UNMT model. Experimental results show that both agreement methods effectively mitigate the degradation of UBWE quality and significantly improve UNMT performance.
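To make the embedding-agreement idea concrete, the following is a minimal PyTorch sketch of an agreement regularizer. It assumes the regularizer is an L2 penalty that keeps the UNMT embedding table close to the frozen pre-trained UBWE during back-translation; the class name, the penalty form, and the weighting are illustrative assumptions, not the thesis' exact formulation.

```python
import torch
import torch.nn as nn

class UBWEAgreementRegularizer(nn.Module):
    """Illustrative sketch: penalize drift of the UNMT embeddings away from the UBWE."""

    def __init__(self, pretrained_ubwe: torch.Tensor, weight: float = 0.1):
        super().__init__()
        # Frozen pre-trained unsupervised bilingual word embeddings (vocab x dim).
        self.register_buffer("ubwe", pretrained_ubwe)
        self.weight = weight

    def forward(self, embedding: nn.Embedding) -> torch.Tensor:
        # Mean squared drift between the trainable embedding table and the frozen UBWE.
        drift = embedding.weight - self.ubwe
        return self.weight * drift.pow(2).mean()

# Hypothetical use inside a back-translation training step:
#   loss = translation_loss + regularizer(model.shared_embedding)
#   loss.backward()
```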
3. Unsupervised neural machine translation with cross-lingual language representation agreement. The pre-training strategy has been extended from UBWE to the cross-lingual masked language model (CMLM), further improving UNMT performance. Like UBWE, however, the CMLM is used only to initialize the UNMT model, while experimental results show that CMLM quality has a significant effect on UNMT performance not only in the initialization stage but throughout training. This thesis therefore proposes two cross-lingual language representation agreement methods, CMLM agreement regularization and CMLM knowledge distillation, which improve UNMT by adding CMLM training. CMLM agreement regularization trains the CMLM on the encoder side during back-translation, further enriching the source representation of the encoder. CMLM knowledge distillation introduces the pre-trained CMLM as a teacher model to guide CMLM training, making full use of the pre-trained model to improve translation performance. Experimental results show that the proposed strategies enrich the source representation of the translation model and significantly improve translation quality.

4. Knowledge distillation for multilingual unsupervised neural machine translation. This thesis extends UNMT to the multilingual scenario and proposes a multilingual UNMT framework. To further improve its performance, we propose two knowledge distillation methods: self-knowledge distillation and language-branch knowledge distillation. We argue that the reconstructed translations generated from the same source sentence through different paths should agree during back-translation training; self-knowledge distillation exploits this by constructing different reconstruction paths and thus makes better use of multilingual information. Language-branch knowledge distillation introduces language-branch translation models as teacher models that distill richer language representations into the multilingual UNMT model. Experimental results verify the effectiveness of the proposed multilingual UNMT system; moreover, the proposed methods alleviate poor performance on low-resource language pairs.
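The distillation objectives in points 3 and 4 share a common teacher-student form. Below is a minimal PyTorch sketch of such a loss, assuming it interpolates a cross-entropy term on the reference tokens with a temperature-scaled KL term towards a teacher distribution (a pre-trained CMLM, a language-branch model, or an alternative reconstruction path); the temperature and interpolation weight are hypothetical, not values from the thesis.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      targets: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Illustrative teacher-student loss: (1 - alpha) * CE + alpha * KL-to-teacher."""
    vocab = student_logits.size(-1)
    # Standard cross-entropy against the reference (or back-translated) targets.
    ce = F.cross_entropy(student_logits.view(-1, vocab), targets.view(-1))
    # Temperature-scaled KL divergence towards the (detached) teacher distribution.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return (1.0 - alpha) * ce + alpha * kd
```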
Keywords/Search Tags: unsupervised neural machine translation, distant language pair, shared latent representation, bilingual word embedding agreement, cross-lingual language representation agreement, multilingual machine translation