
Research On Integrating Translation Memory Into Neural Machine Translation

Posted on: 2022-10-16    Degree: Master    Type: Thesis
Country: China    Candidate: Q X He    Full Text: PDF
GTID: 2518306530498294    Subject: Software engineering
Abstract/Summary:
In recent years there has been growing interest in improving neural machine translation (NMT) with translation memory (TM). Since NMT now clearly outperforms statistical machine translation, it is widely applied in general domains and, increasingly, in specialized ones. Research on TM-augmented NMT has produced promising results, but considerable room for improvement remains. Studying how to combine NMT and TM is therefore of theoretical significance for improving the translation quality of NMT, and of practical value for improving online NMT services and promoting the use of NMT in specialized domains.

Starting from the characteristics of NMT and translation memory, this thesis first discusses why combining the two matters. After an in-depth analysis of the strengths and weaknesses of existing fusion methods, we propose our own models and methods. Our work builds on the Transformer: we explore fusion based on translation fragments, propose a new model for integrating translation memory into NMT, and design a new training method for deep integration. The main contributions are as follows:

1. A word position-aware method for integrating translation memory, which improves on fragment-based integration. Previous fragment-based methods capture only very local context from the translation memory, so they translate poorly even when the memory is highly similar to the test sentence; even if the training corpus contains a reference translation for the test sentence, they cannot reproduce it exactly. The proposed position-aware method captures more context from the memory while retaining the efficiency of fragment-based integration.

2. A double-chain graph for integrating translation memory into NMT. Fragment-based integration raises two key questions: which words should be rewarded during decoding, and how large the reward for a matched word should be. We propose a novel and effective double-chain graph structure that builds a word chain and a position chain over the translation memory, capturing more of its context while remaining efficient. Applied on top of a strong baseline NMT system, the method outperforms the baseline.

3. A new TM-aware neural machine translation model that is significantly better than several strong baselines in both translation accuracy and efficiency. Running time matters in practice: some existing TM-based systems are slow to train and decode, and some pipelines are particularly complex, which hinders their deployment in online systems. Exploiting the strong fitting ability of neural networks to let the model learn from the translation memory automatically, we design a lightweight TM-aware NMT model from the perspective of model integration. We explore three ways of merging memory sentences, from coarse to fine, and demonstrate the model's effectiveness and efficiency both on datasets specific to the TM-integration task (gains of up to 4.7 BLEU points) and on general machine translation datasets.

4. A new training criterion that addresses a robustness problem in the above model. We observe that when the retrieved memory has low similarity to the input, translation quality degrades sharply; the cause is overfitting during training. Inspired by data augmentation and multi-task learning, we propose a new training method that avoids this overfitting.
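
To make the fragment-reward idea behind contributions 1 and 2 concrete, the following is a minimal, hypothetical sketch of decode-time rewarding: candidate tokens that also appear on the target side of the retrieved memory receive a bonus scaled by the fuzzy-match similarity between the input and the memory's source side. The function names, the similarity measure, and the additive scoring are illustrative assumptions, not the thesis's actual formulation (which uses the double-chain graph to decide which words to reward and by how much).

```python
# Hypothetical sketch of decode-time TM rewarding (assumed names and scoring;
# the thesis's actual mechanism is the double-chain graph).
import math
from difflib import SequenceMatcher

def fuzzy_match(src_tokens, tm_src_tokens):
    """Similarity in [0, 1] between the input source and the TM source."""
    return SequenceMatcher(None, src_tokens, tm_src_tokens).ratio()

def reward_tm_tokens(log_probs, tm_target_tokens, sim, weight=0.5):
    """Add a similarity-scaled bonus to candidate tokens found in the TM target.

    log_probs: dict mapping candidate token -> model log-probability at this step.
    Returns a rescored copy; in a real decoder this runs at every beam step.
    """
    bonus = weight * sim
    return {tok: lp + (bonus if tok in tm_target_tokens else 0.0)
            for tok, lp in log_probs.items()}

# Toy usage: the model slightly prefers "chat", but the retrieved memory
# supports "chien", and the sources are similar enough to flip the choice.
src = "the small dog sleeps".split()
tm_src = "the small dog runs".split()
tm_tgt = {"le", "petit", "chien", "court"}

sim = fuzzy_match(src, tm_src)                       # 0.75: 3 of 4 words match
step_log_probs = {"chat": math.log(0.5), "chien": math.log(0.4)}
rescored = reward_tm_tokens(step_log_probs, tm_tgt, sim)
print(max(rescored, key=rescored.get))               # -> "chien"
```

Scaling the bonus by match similarity is the key safeguard in this sketch: a low-similarity memory contributes almost nothing, so the reward cannot drag the decoder toward an irrelevant translation.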
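The training method of contribution 4 is described only at a high level. One plausible instantiation, given the stated inspiration from data augmentation, is to occasionally pair a training example with a weak or empty memory so the model does not learn to rely on near-perfect matches. Everything below (function names, probabilities, sampling scheme) is assumed for illustration and is not taken from the thesis.

```python
# A plausible (assumed) training-time augmentation for the robustness problem:
# sometimes pair an example with a weak or empty memory so the model does not
# learn to copy only from near-perfect matches. Probabilities are illustrative.
import random

def sample_tm(example, tm_candidates, p_drop=0.2, p_weak=0.2, rng=random):
    """Return (source, target, tm), where tm may be deliberately degraded.

    example: (source, target) training pair.
    tm_candidates: retrieved TM target sentences, most similar first.
    """
    src, tgt = example
    roll = rng.random()
    if roll < p_drop or not tm_candidates:
        tm = None                        # train with no memory at all
    elif roll < p_drop + p_weak:
        tm = rng.choice(tm_candidates)   # a random, possibly weak, match
    else:
        tm = tm_candidates[0]            # the usual best match
    return src, tgt, tm

# Toy usage over a single training pair.
pair = (["hello", "world"], ["bonjour", "monde"])
memories = [["bonjour", "tout", "le", "monde"]]
print(sample_tm(pair, memories))
```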
Keywords/Search Tags: Translation Memory, Neural Machine Translation, Position Aware, Double Chain Graph, Model Integration