Multilingual Neural Machine Translation Based On Data Augmentation And Model Pruning

Posted on:2023-12-28

Degree:Master

Type:Thesis

Country:China

Candidate:H Yang

GTID:2558306845499474

Subject:Computer Science and Technology

Abstract/Summary:

Machine translation is the automatic translation of text from one natural language into another by a computer, and it is one of the most important research directions in natural language processing. Neural machine translation is machine translation based on deep learning. Because it is data-driven, scarce or unbalanced corpus resources degrade the performance of multilingual neural machine translation models. At present, neural machine translation models led by the Transformer have become the most prevalent machine translation paradigm. As globalization progresses, a model that translates in only a single direction cannot meet practical needs, and the idea of multilingual translation has emerged. Multilingual neural machine translation refers to a neural machine translation system that handles translation between multiple languages simultaneously with a single model. The main innovations and contributions of this thesis fall into the following two aspects.

(1) For multilingual translation in the Romance language family, where parallel corpus resources are scarce and unbalanced, a multilingual neural machine translation model based on fully shared parameters is proposed that integrates data augmentation methods such as back-translation and the use of a pivot language. High-resource languages are used to transfer knowledge to low-resource languages, for example through pre-trained models, to improve translation quality in low-resource directions, and a temperature sampling strategy is used to alleviate the imbalance of corpus resources. Experiments on the WMT2021 multilingual low-resource public dataset show that the proposed method steadily improves translation performance in some or all language pairs. The final result improves by about 12 average BLEU over the multilingual baseline.

(2) The thesis identifies the problem of suboptimal translation in existing fully parameter-sharing multilingual neural machine translation and offers a solution. Because such a model shares all parameters, knowledge from different languages both transfers and interferes, so the model’s translation performance on some language pairs is not optimal. To improve the model’s ability to capture language diversity and to reduce interference between language pairs, this thesis proposes a "model pruning-knowledge distillation-parameter rejuvenation" training method and verifies it experimentally in two translation scenarios, one-to-many and many-to-many, on public multilingual datasets. Compared with a strong multilingual baseline, the proposed "prune first, then rejuvenate" method improves the average BLEU by about 0.53 and 0.72 in the one-to-many and many-to-many scenarios, respectively, and ablation experiments demonstrate the effectiveness of each part of the method.
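
The temperature sampling strategy named in contribution (1) is commonly formulated as sampling language pair i with probability proportional to (n_i / N)^(1/T), where n_i is the size of that pair's corpus, N is the total size, and T is the temperature. The abstract does not give the exact formula or temperature used, so the following is only a minimal sketch of that common formulation. The function name, the pair labels, the corpus sizes, and the value T = 5 are all illustrative assumptions, not values from the thesis.

```python
def temperature_sampling_probs(sizes, temperature=5.0):
    """Return sampling probabilities p_i proportional to (n_i / N) ** (1 / T).

    `sizes` maps a language-pair label to its number of sentence pairs.
    T = 1 reproduces sampling proportional to corpus size; larger T
    flattens the distribution toward uniform, so low-resource pairs
    are sampled more often.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    total = sum(sizes.values())
    # Scale each pair's share of the data by the exponent 1 / T.
    scaled = {pair: (n / total) ** (1.0 / temperature) for pair, n in sizes.items()}
    # Renormalize so the probabilities sum to 1.
    norm = sum(scaled.values())
    return {pair: s / norm for pair, s in scaled.items()}


# Hypothetical corpus sizes, chosen only to illustrate the imbalance.
corpus_sizes = {"high": 1_000_000, "mid": 100_000, "low": 10_000}

proportional = temperature_sampling_probs(corpus_sizes, temperature=1.0)
smoothed = temperature_sampling_probs(corpus_sizes, temperature=5.0)

for pair in corpus_sizes:
    print(f"{pair}: T=1 -> {proportional[pair]:.3f}, T=5 -> {smoothed[pair]:.3f}")
```

Under this formulation, raising T increases the sampling share of the smallest corpus and reduces that of the largest, which is the sense in which temperature sampling alleviates corpus imbalance.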

Keywords/Search Tags:

Neural Machine Translation, Multilingual, Data Augmentation, Model Pruning, Knowledge Distillation

Related items

1	The Design And Implementation Of Neural Machine Translation System For Multilingual Mutual Translation
2	Research On Data Augmentation Strategy For Neural Machine Translation
3	Advanced Data Augmentation Strategy For Neural Machine Translation
4	A Lightweight Multilingual Translation Model For Asian Languages
5	Data Augmentation Research Of Neural Machine Translation
6	Implementation, Verification And Compression By Pruning Of Neural Machine Translation Model
7	Research On Interpretability Of Neural Machine Translation: Model’s Representation, Training And Behavior
8	Research On Low-Resource Machine Translation Based On Teacher-Student Model
9	Research And Implementation Of Model Compression Method Based On Knowledge Distillation
10	Research On Network Optimization Of Neural Machine Translation System