
Research of Exposure Bias in Neural Machine Translation

Posted on: 2022-08-10    Degree: Master    Type: Thesis
Country: China    Candidate: J X Zhao    Full Text: PDF
GTID: 2518306524993679    Subject: Master of Engineering
Abstract/Summary:
Neural Machine Translation (NMT) is the task of translating text in one language into synonymous text in another. It is a very active research direction in Natural Language Processing (NLP), and it plays a vital role in an increasingly international society. Most NMT models are built on the Sequence-to-Sequence (Seq2Seq) structure, which contains an encoder and a decoder. The encoder projects the source-side sequence into a semantic space; the decoder predicts the target-side sequence iteratively based on the encoder's output. The encoder behaves identically during training and testing: in both cases it encodes the source sequence into the semantic space. The decoder, however, behaves differently. During training it can use the ground-truth target sequence as input, but at test time the target sequence is not available, so the model can only feed the output of the previous time step back as the current input. This discrepancy between training and testing leads to exposure bias. To address the ubiquitous exposure bias in NMT models, this thesis proposes a new NMT model that improves translation quality. The main work is as follows:

(1) We propose a model called Twin-GAN, which brings the training and testing processes closer together to minimize the difference between them. Based on the characteristics of Seq2Seq models, we design a new strategy, "Similarity Selection", to choose the input of the decoder. The model contains two generators and two discriminators. One generator applies Similarity Selection during training, while the other simulates the test process during training. One discriminator pushes the output distributions of the two generators toward each other, reducing the gap between training and testing; the other discriminator improves the generators' translation performance at the sentence level. Experiments show that Twin-GAN performs best among the compared models, reaching 28.07 and 21.06 BLEU on the IWSLT 2014 German-English and WMT 17 Chinese-English translation tasks, respectively.

(2) Extending the Twin-GAN work, we adopt the idea of Zero-Shot Learning (ZSL) to further mitigate exposure bias from the data side. The model is trained with ZSL multi-task learning, comprising a ZSL translation task, a Denoising Auto-Encoding (DAE) task, and Back-Translation (BT). In the ZSL translation task, the model learns language-agnostic features from other bilingual corpora and is then transferred to the language pair that needs to be translated. The DAE task trains the model to recover complete data from a corrupted version. The "pseudo-parallel corpus" produced by the BT task further improves the model's translation performance. Experimental results show that this method outperforms other models that use no parallel corpus, achieving an average of 37.65 BLEU on Spanish-French translation in both directions on the United Nations parallel corpus.
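To make the train/test discrepancy behind exposure bias concrete, the following minimal PyTorch-style sketch contrasts teacher forcing (training) with free-running decoding (testing). The `decoder` interface, hidden-state shape, and greedy decoding are illustrative assumptions, not the thesis's actual implementation.

```python
import torch

# Assumed decoder interface: one step maps (prev_token_ids, hidden) -> (logits, hidden),
# with hidden of shape (num_layers, batch, dim). These shapes are assumptions.

def teacher_forcing_step(decoder, tgt_tokens, hidden):
    """Training: the ground-truth previous token is fed at every step."""
    logits = []
    for t in range(1, tgt_tokens.size(1)):
        step_logits, hidden = decoder(tgt_tokens[:, t - 1], hidden)
        logits.append(step_logits)          # loss is computed against tgt_tokens[:, t]
    return torch.stack(logits, dim=1), hidden

def free_running_decode(decoder, bos_id, hidden, max_len=50):
    """Testing: the model's own previous prediction is fed back as input."""
    prev = torch.full((hidden.size(1),), bos_id, dtype=torch.long)
    outputs = []
    for _ in range(max_len):
        step_logits, hidden = decoder(prev, hidden)
        prev = step_logits.argmax(dim=-1)   # greedy prediction replaces the gold token
        outputs.append(prev)
    return torch.stack(outputs, dim=1)
```

Exposure bias arises because the distribution of `prev` in the second function drifts away from the gold prefixes the model only ever saw in the first.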
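The abstract does not detail how "Similarity Selection" chooses the decoder input. One plausible reading, sketched below purely as an assumption, is that at each step the input is selected between the gold token and the model's own prediction according to embedding similarity; every name and the threshold here are hypothetical.

```python
import torch
import torch.nn.functional as F

def similarity_select(embed, gold_token, pred_token, threshold=0.8):
    """Hypothetical 'Similarity Selection': feed the model's own prediction when
    its embedding is already close to the gold token's embedding, otherwise fall
    back to the gold token. This is one possible reading of the strategy, not
    the thesis's confirmed formulation."""
    sim = F.cosine_similarity(embed(pred_token), embed(gold_token), dim=-1)
    use_pred = sim > threshold                      # per-example decision
    return torch.where(use_pred, pred_token, gold_token)
```

Under this reading, training gradually exposes the decoder to its own outputs wherever they are already trustworthy, moving the training regime toward the test-time regime.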
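The abstract states that one discriminator makes the two generators' output distributions close to each other. A minimal adversarial sketch of that idea, under assumed interfaces (a `disc` module returning logits, and two generator outputs in a comparable representation):

```python
import torch
import torch.nn.functional as F

def distribution_matching_losses(disc, out_sim, out_free):
    """d_loss trains the discriminator to separate the outputs of the
    Similarity-Selection generator (out_sim) from the free-running generator
    (out_free); g_loss trains the free-running generator to fool it, pulling
    the two distributions together. All interfaces are illustrative assumptions."""
    d_real = disc(out_sim.detach())
    d_fake = disc(out_free.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    g_fake = disc(out_free)
    g_loss = F.binary_cross_entropy_with_logits(g_fake, torch.ones_like(g_fake))
    return d_loss, g_loss
```

The `.detach()` calls keep the discriminator update from back-propagating into the generators; only `g_loss` updates the generator, which is the standard GAN training split.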
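The DAE and BT objectives in part (2) are standard unsupervised-MT building blocks. The sketch below shows their data flow under assumed `model.translate`, noise-parameter, and corpus interfaces; the specific corruption operations (word dropping and local shuffling) are a common choice, not necessarily the thesis's exact recipe.

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_k=3):
    """Corrupt a sentence for the DAE task: drop words and locally shuffle them,
    so the model learns to reconstruct the clean input from a corrupted one."""
    kept = [w for w in tokens if random.random() > drop_prob]
    # Local shuffle: each word moves at most roughly shuffle_k positions.
    keyed = sorted(enumerate(kept), key=lambda p: p[0] + random.uniform(0, shuffle_k))
    return [w for _, w in keyed]

def back_translate(model, mono_tgt_sentences, src_lang, tgt_lang):
    """BT task: translate monolingual target-side text back into the source
    language to build a 'pseudo-parallel corpus' (synthetic source, real target).
    The model.translate API is an assumption for illustration."""
    pseudo_pairs = []
    for tgt in mono_tgt_sentences:
        synthetic_src = model.translate(tgt, src=tgt_lang, dst=src_lang)
        pseudo_pairs.append((synthetic_src, tgt))
    return pseudo_pairs
```

Training on the pseudo-parallel pairs gives the model supervised-style signal without any genuine parallel corpus, which is how the reported 37.65 BLEU can be reached in the zero-parallel-data setting.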
Keywords/Search Tags:deep learning, neural machine translation, exposure bias, generative adversarial network, zero-shot learning