Advanced Data Augmentation Strategy For Neural Machine Translation

Posted on: 2020-03-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z R Zhang
Full Text: PDF
GTID: 1368330575466578
Subject: Computer application technology
Abstract/Summary:
In recent years, neural machine translation (NMT) has achieved rapid development, replacing traditional statistical machine translation (SMT) and becoming the mainstream paradigm of machine translation application and research. However, NMT systems rely heavily on large-scale, high-quality parallel corpora, resulting in poor performance in low-resource and domain-specific settings. Data augmentation is a very promising and effective way to address data sparsity in neural network training. It has been widely used with good results in computer vision and natural language processing, but is still not well applied in NMT. This thesis mainly explores how to fully leverage data augmentation to improve NMT systems, and designs different data augmentation strategies for three translation settings: semi-supervised, supervised, and unsupervised.

· In the semi-supervised setting, a novel data augmentation method is proposed to effectively exploit abundant monolingual data. By extending back-translation, this thesis designs a new joint training framework that uses the joint expectation-maximization (EM) algorithm to train source-to-target and target-to-source NMT systems together. The method first pre-trains the NMT systems on bilingual data and then updates the bidirectional NMT models with monolingual data iteratively. In each iteration, the two NMT models translate the monolingual data from one language to the other to construct pseudo-training data, and the bidirectional models are then jointly optimized on the parallel and pseudo-parallel corpora. Large-scale Chinese-English and English-Chinese translation experiments show that this method significantly improves translation performance.

· In the supervised setting, a new data augmentation method is proposed to mitigate the exposure bias problem by leveraging target-bidirectional agreement. Since the translations generated by bidirectional decoding are complementary to a certain degree, this property can be used to alleviate exposure bias in NMT models. To this end, this thesis proposes a novel model regularization method for NMT training that improves the agreement between translations generated by left-to-right (L2R) and right-to-left (R2L) NMT decoders. Specifically, two Kullback-Leibler (KL) divergence regularization terms are introduced into the NMT training objective to reduce the mismatch between the output probabilities of the L2R and R2L models. The optimization of the L2R and R2L models is then integrated into a joint training framework in which they act as helper systems for each other, and both models achieve further improvements through an interactive update process. Experiments on Chinese-English and English-German translation tasks demonstrate that the proposed method not only significantly outperforms state-of-the-art baselines, but also effectively alleviates the exposure bias problem.

· In the unsupervised setting, a new data augmentation method is proposed that leverages the robustness of SMT systems to handle the data noise problem. In unsupervised training, the pseudo-training data is generated by the unsupervised translation model itself, so a large number of random errors and noises are inevitably introduced. To this end, this thesis designs a novel unsupervised training method that introduces a phrase-based SMT model as posterior regularization to denoise and guide the training of unsupervised NMT models. In this way, the negative effect of errors in the unsupervised training process is alleviated in a timely manner by the SMT model filtering noise through its phrase tables. The method then uses the expectation-maximization algorithm to unify the updates of the SMT and NMT models, so that all models are jointly trained and gradually benefit from each other. Large-scale machine translation experiments verify the effectiveness of the proposed method in mitigating data noise, and the approach also achieves new state-of-the-art translation performance in unsupervised machine translation. In addition, this unsupervised training method is adapted to the language style transfer task, and the corresponding experiments further confirm its practicability.

In short, this work has investigated the design and application of data augmentation in NMT. For three different translation scenarios, this thesis designs specialized data augmentation strategies that significantly improve the performance of NMT models. This thesis also provides new methodologies and perspectives for NMT research. In the future, I would like to further apply these data augmentation strategies to other natural language processing tasks.
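The iterative joint back-translation loop described for the semi-supervised setting can be sketched as follows. This is a minimal toy, not the thesis's implementation: `ToyTranslator` is a hypothetical word-for-word stand-in for a real NMT system, and the data is illustrative.

```python
# Toy sketch of EM-style joint back-translation training: two directional
# models generate pseudo-parallel data for each other from monolingual text,
# then both are re-trained on real + pseudo data. All names are hypothetical.

class ToyTranslator:
    """Word-by-word translator; train() absorbs aligned sentence pairs."""
    def __init__(self):
        self.table = {}

    def train(self, pairs):
        for src, tgt in pairs:
            for s, t in zip(src.split(), tgt.split()):
                self.table[s] = t

    def translate(self, sent):
        return " ".join(self.table.get(w, w) for w in sent.split())

def joint_backtranslation(bitext, mono_src, mono_tgt, iterations=2):
    # Pre-train both directions on the bilingual corpus.
    s2t, t2s = ToyTranslator(), ToyTranslator()
    s2t.train(bitext)
    t2s.train([(t, s) for s, t in bitext])
    for _ in range(iterations):
        # E-step: each model back-translates monolingual data for the other.
        pseudo_s2t = [(t2s.translate(t), t) for t in mono_tgt]
        pseudo_t2s = [(s2t.translate(s), s) for s in mono_src]
        # M-step: jointly re-train on parallel + pseudo-parallel corpora.
        s2t.train(bitext + pseudo_s2t)
        t2s.train([(t, s) for s, t in bitext] + pseudo_t2s)
    return s2t, t2s

bitext = [("ich bin", "i am"), ("du bist", "you are")]
s2t, t2s = joint_backtranslation(bitext, mono_src=["ich bist"], mono_tgt=["you am"])
print(s2t.translate("du bin"))
```

In a real system the toy models would be seq2seq networks and the "training" step a gradient update, but the data flow of the iterative loop is the same.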
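The bidirectional-agreement objective in the supervised setting can be illustrated numerically. The distributions, loss values, and the weight `alpha` below are made-up examples, not outputs of the thesis's models; the point is only the shape of the regularized objective.

```python
# Sketch of the target-bidirectional agreement regularizer: the NLL of the
# L2R and R2L decoders plus two KL terms, KL(L2R || R2L) and KL(R2L || L2R),
# averaged over target positions. All numbers are illustrative.

import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def agreement_loss(nll_l2r, nll_r2l, probs_l2r, probs_r2l, alpha=0.5):
    # Symmetric KL penalty on the per-position output distributions.
    kl = sum(kl_divergence(p, q) + kl_divergence(q, p)
             for p, q in zip(probs_l2r, probs_r2l)) / len(probs_l2r)
    return nll_l2r + nll_r2l + alpha * kl

# Two target positions, vocabulary of size 3.
probs_l2r = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
probs_r2l = [[0.6, 0.3, 0.1], [0.6, 0.3, 0.1]]
loss = agreement_loss(2.0, 2.1, probs_l2r, probs_r2l)
print(round(loss, 4))
```

When the two decoders agree exactly (as at the second position), the KL terms vanish and the objective reduces to the ordinary bidirectional NLL.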
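The SMT-based denoising idea in the unsupervised setting can be caricatured as a filter over pseudo-parallel pairs. The thesis uses the SMT model as a soft posterior-regularization constraint; the hard threshold filter below is a deliberate simplification, and the phrase table and threshold are invented for illustration.

```python
# Toy sketch of phrase-table denoising: keep a pseudo-parallel pair only if
# enough of its aligned word pairs are supported by the SMT phrase table.
# A hard filter standing in for the thesis's soft posterior regularization.

def table_support(src, tgt, phrase_table):
    """Fraction of aligned word pairs found in the phrase table."""
    pairs = list(zip(src.split(), tgt.split()))
    hits = sum((s, t) in phrase_table for s, t in pairs)
    return hits / len(pairs) if pairs else 0.0

def denoise(pseudo_pairs, phrase_table, threshold=0.6):
    # Discard pairs the phrase table considers poorly supported,
    # so noisy model outputs do not feed back into training.
    return [(s, t) for s, t in pseudo_pairs
            if table_support(s, t, phrase_table) >= threshold]

phrase_table = {("ich", "i"), ("bin", "am"), ("du", "you")}
pseudo = [("ich bin", "i am"), ("ich zzz", "i qqq"), ("du bin", "you am")]
kept = denoise(pseudo, phrase_table)
```

Here the noisy pair `("ich zzz", "i qqq")` is dropped because only half of its word pairs appear in the phrase table, while the two well-supported pairs survive for the next training round.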
Keywords/Search Tags: Neural Machine Translation, Data Augmentation, Semi-supervised Learning, Target-bidirectional Agreement, Unsupervised Training