
Machine Translation Through Target-Bidirectional Decoding Consistency

Posted on: 2020-08-11    Degree: Master    Type: Thesis
Country: China    Candidate: L Y Zhang    Full Text: PDF
GTID: 2428330590473267    Subject: Software engineering
Abstract/Summary:
In recent years, with the rapid development of deep learning, Neural Machine Translation (NMT) models based on the Seq2Seq framework have made great breakthroughs, even surpassing traditional Statistical Machine Translation (SMT) on many language pairs. However, both RNN-based Seq2Seq models and the self-attention-based Transformer suffer from exposure bias: during decoding they tend to generate unbalanced translations with good prefixes but poor suffixes. To address this problem, this thesis studies the target-bidirectional decoding process and proposes a machine translation model based on target-bidirectional decoding consistency. The main research contents include the following parts:

(1) Research on target-bidirectional decoding. In a conventional NMT model, the prediction of the next token depends directly on the previously generated tokens, so errors made early in the generation process are retained and propagated, degrading subsequent predictions. To mitigate this, a reverse (right-to-left) decoding pass is added to the decoding stage. To realize target-bidirectional decoding, the candidate translations of the left-to-right and right-to-left models are reranked by a joint model. Experimental results show that target-bidirectional decoding improves machine translation performance.

(2) Research on machine translation based on bidirectional decoding consistency. The target-bidirectional decoding model only selects the best result from the 2k-best candidates by reranking; it does not address exposure bias within the candidates themselves. This thesis therefore proposes a neural translation model based on bidirectional decoding consistency: building on the target-bidirectional decoding model, the consistency between the two directional models is added to the training objective as a KL-divergence term, so that the two models are tuned jointly. Experimental results show that the exposure bias problem is indeed alleviated and the model performs as expected.

(3) Research on improving the bidirectionally consistent decoding model. Since joint tuning is extremely time-consuming, this thesis first obtains a strong model through pre-training, which significantly speeds up training of the translation model while preserving its performance. In addition, a balance factor that adjusts the contribution of the KL-divergence term to the objective is determined through extensive experiments, further improving the model's performance.
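The abstract does not state the training objective explicitly; as a non-authoritative sketch only, the joint criterion described in part (2), together with the balance factor from part (3), might take the following form, where $\theta_{\mathrm{l2r}}$ and $\theta_{\mathrm{r2l}}$ denote the left-to-right and right-to-left models, $(x, y)$ a training pair, and $\lambda$ the assumed balance factor:

\[
\mathcal{L}(\theta_{\mathrm{l2r}}, \theta_{\mathrm{r2l}})
= \mathcal{L}_{\mathrm{MLE}}(\theta_{\mathrm{l2r}})
+ \mathcal{L}_{\mathrm{MLE}}(\theta_{\mathrm{r2l}})
+ \lambda \, \mathrm{KL}\!\left( p_{\theta_{\mathrm{l2r}}}(y \mid x) \,\middle\|\, p_{\theta_{\mathrm{r2l}}}(y \mid x) \right)
\]

Here the MLE terms are the usual cross-entropy losses of the two directional decoders, and the KL term penalizes disagreement between their predictive distributions; the exact (e.g., symmetric or token-level) variant used in the thesis may differ.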
Keywords/Search Tags: Neural Machine Translation, bidirectional decoding, KL divergence, consistency learning