Scheduled Sampling For Neural Machine Translation

Posted on: 2022-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Liu
Full Text: PDF
GTID: 2518306563473544
Subject: Computer Science and Technology
Abstract/Summary:
Machine translation uses computational linguistics to automatically translate a source language into a target language. Given its research significance and broad application prospects, both industry and academia treat it as a key research direction, and it has become one of the most challenging fields in natural language processing. With the rapid development of neural networks and the growth of data resources in recent years, neural machine translation (NMT) has gradually become the mainstream approach. However, current NMT models suffer from an inconsistency between training and inference, known as the exposure bias problem.

Researchers have proposed a variety of solutions to exposure bias, of which the most representative is the scheduled sampling algorithm. Its core idea is to simulate the inference scenario during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference. Although scheduled sampling has been widely used and continuously improved, its core scheduling strategy still has two key shortcomings. First, the schedule depends only on the training step, so the simulated translation error distribution is independent of the decoding step. In real inference, however, errors accumulate, and the translation error rate tends to increase with the decoding step, which is inconsistent with the error distribution that existing scheduled sampling simulates. Second, existing approaches use only a decay function of the training step, which cannot reflect the model's actual competence and limits their effectiveness.

To address these limitations, this research proposes scheduled sampling based on the decoding step and confidence-aware scheduled sampling. The main contributions of this paper are as follows:

(1) To the best of our knowledge, we are the first to propose scheduled sampling based on the decoding step. The algorithm gradually increases the probability of sampling model predictions as the decoding step increases, simulating a translation error distribution that is more consistent with real inference. This further bridges the gap between training and inference and improves translation quality. This paper also explores how to combine scheduled sampling based on training steps with scheduled sampling based on decoding steps, and proposes three simple and effective combinations: the product of the two probabilities, their arithmetic mean, and a composite function. The proposed methods are verified on three large-scale WMT translation tasks and two generative text summarization tasks. The experimental results show that our methods significantly improve on the existing scheduled sampling algorithm and achieve state-of-the-art performance on the four metrics of the two text summarization tasks.

(2) To the best of our knowledge, we are the first to propose confidence-aware scheduled sampling, which samples tokens according to the model's real-time competence rather than coarse-grained predefined patterns. We further explore sampling noisier tokens at high-confidence positions, which prevents scheduled sampling from degenerating into the original teacher forcing mode. Experiments on three large-scale WMT translation tasks verify the effectiveness of our methods.
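The scheduling ideas described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the exponential decay forms, the constants `k`, the particular composite form, the confidence thresholds, and all function names are assumptions chosen for illustration only. The abstract states only that the probability of feeding model predictions grows with the decoding step, that the two schedules can be combined by product, arithmetic mean, or composition, and that the confidence-aware variant feeds noisier tokens at high-confidence positions.

```python
import math
import random


def p_truth_by_training_step(i: int, k: float = 10000.0) -> float:
    """f(i): chance of feeding the gold token, decaying as training proceeds.

    Exponential decay and k are assumptions, not values from the thesis.
    """
    return math.exp(-i / k)


def p_truth_by_decoding_step(t: float, k: float = 20.0) -> float:
    """g(t): chance of feeding the gold token, decaying along the sentence.

    Later positions are more likely to receive the model's own prediction.
    """
    return math.exp(-t / k)


def p_truth(i: int, t: int, combine: str = "product") -> float:
    """Combine the training-step and decoding-step schedules."""
    f = p_truth_by_training_step(i)
    g = p_truth_by_decoding_step(t)
    if combine == "product":
        return f * g
    if combine == "mean":
        return (f + g) / 2.0
    if combine == "composite":
        # Assumed composite form: early in training (f near 1) the decoding
        # step is scaled toward 0, so the gold token is almost always fed.
        return p_truth_by_decoding_step(t * (1.0 - f))
    raise ValueError(f"unknown combination: {combine}")


def mix_decoder_inputs(gold, predicted, i, rng, combine="product"):
    """Choose, per position t, between the gold and the predicted token."""
    return [
        gold_tok if rng.random() < p_truth(i, t, combine) else pred_tok
        for t, (gold_tok, pred_tok) in enumerate(zip(gold, predicted))
    ]


def confidence_aware_input(gold_tok, pred_tok, random_tok, confidence,
                           t_pred=0.9, t_noise=0.95):
    """Pick the decoder input from the model's confidence at this position.

    Both thresholds are hypothetical. Low confidence keeps the gold token,
    high confidence feeds the prediction, very high confidence feeds noise.
    """
    if confidence > t_noise:
        return random_tok
    if confidence > t_pred:
        return pred_tok
    return gold_tok


rng = random.Random(0)
gold = ["the", "cat", "sat", "down"]
pred = ["a", "cat", "sits", "there"]
print(mix_decoder_inputs(gold, pred, i=5000, rng=rng))
```

With any of the three combinations, the probability of keeping the gold token is non-increasing in the decoding step `t`, so positions later in the sentence see more model predictions, mirroring the error accumulation the abstract describes for real inference.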
Keywords/Search Tags: Neural Machine Translation, Exposure Bias Problem, Scheduled Sampling