
Research of Exposure Bias in Neural Machine Translation

Posted on: 2022-08-10    Degree: Master    Type: Thesis
Country: China    Candidate: J X Zhao    Full Text: PDF
GTID: 2518306524993679    Subject: Master of Engineering
Abstract/Summary:
Neural Machine Translation (NMT) is the task of translating text in one language into synonymous text in another. It is a very active research direction in Natural Language Processing (NLP), and it plays a vital role in an increasingly international society. Most NMT models are built on the Sequence-to-Sequence (Seq2Seq) structure, which contains an encoder and a decoder. The encoder projects the source-side sequence into a semantic space; the decoder predicts the target-side sequence iteratively based on the encoder's output. The encoder behaves identically during training and testing: in both cases it encodes the source sequence into the semantic space. The decoder, however, behaves differently. During training it can use the ground-truth target sequence as input, but at test time the target sequence is not available, so the model can only feed the output of the previous time step back as the current input. This discrepancy between training and testing leads to exposure bias. To address the ubiquitous exposure bias in NMT models, this thesis proposes a new NMT model that improves translation quality. The main work is as follows:

(1) We propose a model called Twin-GAN, which brings the training and testing processes closer together to minimize the difference between them. Based on the characteristics of Seq2Seq models, we design a new strategy, "Similarity Selection", to choose the input of the decoder. The model contains two generators and two discriminators. One generator applies Similarity Selection during training, while the other simulates the test process during training. One discriminator pushes the output distributions of the two generators toward each other, reducing the gap between training and testing; the other discriminator improves the generators' translation performance at the sentence level. Experiments show that Twin-GAN performs best among the compared models, reaching 28.07 and 21.06 BLEU on the IWSLT 2014 German-English and WMT 17 Chinese-English translation tasks, respectively.

(2) Extending the Twin-GAN work, we adopt the idea of Zero-Shot Learning (ZSL) to further mitigate exposure bias from the data side. The model is trained with ZSL multi-task learning, comprising a ZSL translation task, a Denoising Auto-Encoding (DAE) task, and Back-Translation (BT). In the ZSL translation task, the model learns language-agnostic features from other bilingual corpora and is then transferred to the language pair that needs to be translated. The DAE task trains the model to recover complete data from a corrupted version. The "pseudo-parallel corpus" produced by the BT task further improves the model's translation performance. Experimental results show that this method outperforms other models that use no parallel corpus, achieving an average of 37.65 BLEU on Spanish-French translation in both directions on the United Nations parallel corpus.
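To make the train/test discrepancy behind exposure bias concrete, the following minimal PyTorch-style sketch contrasts teacher forcing (training) with free-running decoding (testing). The `decoder` interface, hidden-state shape, and greedy decoding are illustrative assumptions, not the thesis's actual implementation.

```python
import torch

# Assumed decoder interface: one step maps (prev_token_ids, hidden) -> (logits, hidden),
# with hidden of shape (num_layers, batch, dim). These shapes are assumptions.

def teacher_forcing_step(decoder, tgt_tokens, hidden):
    """Training: the ground-truth previous token is fed at every step."""
    logits = []
    for t in range(1, tgt_tokens.size(1)):
        step_logits, hidden = decoder(tgt_tokens[:, t - 1], hidden)
        logits.append(step_logits)          # loss is computed against tgt_tokens[:, t]
    return torch.stack(logits, dim=1), hidden

def free_running_decode(decoder, bos_id, hidden, max_len=50):
    """Testing: the model's own previous prediction is fed back as input."""
    prev = torch.full((hidden.size(1),), bos_id, dtype=torch.long)
    outputs = []
    for _ in range(max_len):
        step_logits, hidden = decoder(prev, hidden)
        prev = step_logits.argmax(dim=-1)   # greedy prediction replaces the gold token
        outputs.append(prev)
    return torch.stack(outputs, dim=1)
```

Exposure bias arises because the distribution of `prev` in the second function drifts away from the gold prefixes the model only ever saw in the first.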
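The abstract does not detail how "Similarity Selection" chooses the decoder input. One plausible reading, sketched below purely as an assumption, is that at each step the input is selected between the gold token and the model's own prediction according to embedding similarity; every name and the threshold here are hypothetical.

```python
import torch
import torch.nn.functional as F

def similarity_select(embed, gold_token, pred_token, threshold=0.8):
    """Hypothetical 'Similarity Selection': feed the model's own prediction when
    its embedding is already close to the gold token's embedding, otherwise fall
    back to the gold token. This is one possible reading of the strategy, not
    the thesis's confirmed formulation."""
    sim = F.cosine_similarity(embed(pred_token), embed(gold_token), dim=-1)
    use_pred = sim > threshold                      # per-example decision
    return torch.where(use_pred, pred_token, gold_token)
```

Under this reading, training gradually exposes the decoder to its own outputs wherever they are already trustworthy, moving the training regime toward the test-time regime.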
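The abstract states that one discriminator makes the two generators' output distributions close to each other. A minimal adversarial sketch of that idea, under assumed interfaces (a `disc` module returning logits, and two generator outputs in a comparable representation):

```python
import torch
import torch.nn.functional as F

def distribution_matching_losses(disc, out_sim, out_free):
    """d_loss trains the discriminator to separate the outputs of the
    Similarity-Selection generator (out_sim) from the free-running generator
    (out_free); g_loss trains the free-running generator to fool it, pulling
    the two distributions together. All interfaces are illustrative assumptions."""
    d_real = disc(out_sim.detach())
    d_fake = disc(out_free.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    g_fake = disc(out_free)
    g_loss = F.binary_cross_entropy_with_logits(g_fake, torch.ones_like(g_fake))
    return d_loss, g_loss
```

The `.detach()` calls keep the discriminator update from back-propagating into the generators; only `g_loss` updates the generator, which is the standard GAN training split.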
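The DAE and BT objectives in part (2) are standard unsupervised-MT building blocks. The sketch below shows their data flow under assumed `model.translate`, noise-parameter, and corpus interfaces; the specific corruption operations (word dropping and local shuffling) are a common choice, not necessarily the thesis's exact recipe.

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_k=3):
    """Corrupt a sentence for the DAE task: drop words and locally shuffle them,
    so the model learns to reconstruct the clean input from a corrupted one."""
    kept = [w for w in tokens if random.random() > drop_prob]
    # Local shuffle: each word moves at most roughly shuffle_k positions.
    keyed = sorted(enumerate(kept), key=lambda p: p[0] + random.uniform(0, shuffle_k))
    return [w for _, w in keyed]

def back_translate(model, mono_tgt_sentences, src_lang, tgt_lang):
    """BT task: translate monolingual target-side text back into the source
    language to build a 'pseudo-parallel corpus' (synthetic source, real target).
    The model.translate API is an assumption for illustration."""
    pseudo_pairs = []
    for tgt in mono_tgt_sentences:
        synthetic_src = model.translate(tgt, src=tgt_lang, dst=src_lang)
        pseudo_pairs.append((synthetic_src, tgt))
    return pseudo_pairs
```

Training on the pseudo-parallel pairs gives the model supervised-style signal without any genuine parallel corpus, which is how the reported 37.65 BLEU can be reached in the zero-parallel-data setting.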
Keywords/Search Tags:deep learning, neural machine translation, exposure bias, generative adversarial network, zero-shot learning