Quality Estimation Of Machine Translation Using Pre-training Language Model

Posted on:2020-02-09

Degree:Master

Type:Thesis

Country:China

Candidate:Z C Yang

Full Text:PDF

GTID:2428330575495001

Subject:Computer technology

Abstract/Summary:


In recent years, neural machine translation has made major breakthroughs and has been rapidly adopted. However, several problems remain, such as machine translation quality estimation, out-of-vocabulary words, long-sentence translation, over-translation, and omission. Machine translation quality estimation (QE) studies how to assess the quality of machine translation output without a reference translation. Its results can help a machine translation system filter out low-quality translations and build high-quality parallel corpora, and can also reduce the workload of post-editing. The topic therefore has both research significance and practical value.

Existing QE methods fall into two categories: those based on machine learning and those based on deep learning. Both aim to extract features closely related to the QE task, and the quality of the extracted features determines system performance. Recently, pre-trained language models have set new best results on many natural language processing tasks and have shown strong representation learning ability. This thesis therefore explores how to integrate pre-trained language models into QE in order to improve QE performance. The main work and contributions are:

(1) A machine translation quality estimation method is proposed that combines features extracted from pre-trained language models such as ELMo, GPT, and BERT with features extracted by the "bilingual expert" model. The features from the two models complement each other and effectively alleviate the feature sparsity problem in QE. Experimental results show significant improvements on both the sentence-level and word-level tasks.

(2) A sentence-level machine translation quality estimation method based on a BERT+LSTM+MLP architecture is proposed. An LSTM network encodes the high-level features of the source and target sentences extracted by multilingual BERT into fixed-size vectors, which are fed into a fully connected network to obtain the predicted quality score. Experimental results show that this method reaches the best QE performance reported at the time of writing.

(3) A machine translation quality estimation method based on dependency syntax information is proposed. The dependency label of each word in the source sentence and the target translation is converted into a vector representation, concatenated with the word vector, and fed into the model for training so that the model learns syntactic structure information. Experimental results show that QE performance improves further.

In summary, this thesis proposes integrating pre-trained language models and dependency syntax information into the QE task, and verifies the effectiveness, advanced nature, and practicality of the proposed methods through experiments.
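The feature construction described in contribution (3) can be illustrated with a minimal sketch. It assumes a small, hypothetical dependency label set, randomly initialized label vectors, and placeholder word vectors; the abstract does not specify the label inventory, the embedding dimensions, or the parser used, and in a real model the label embeddings would be trainable parameters rather than fixed random values.

```python
import random

# Hypothetical label inventory (assumption): the abstract does not list its label set.
DEP_LABELS = ["nsubj", "obj", "det", "amod", "root", "<unk>"]


def build_label_embeddings(labels, dim, seed=0):
    """Map each dependency label to a small random vector.

    In a trained QE model these vectors would be learned; fixed random
    values are used here only to show the data flow.
    """
    rng = random.Random(seed)
    return {label: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for label in labels}


def concat_features(word_vectors, dep_labels, label_embeddings):
    """Concatenate each word vector with the embedding of its dependency label."""
    if len(word_vectors) != len(dep_labels):
        raise ValueError("need exactly one dependency label per word")
    unk = label_embeddings["<unk>"]  # fallback for labels outside the inventory
    return [
        list(vec) + label_embeddings.get(label, unk)
        for vec, label in zip(word_vectors, dep_labels)
    ]


# Placeholder word vectors (assumption): stand-ins for BERT or other word features.
label_emb = build_label_embeddings(DEP_LABELS, dim=4)
word_vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
dep_labels = ["det", "nsubj", "root"]

features = concat_features(word_vectors, dep_labels, label_emb)
print(len(features), len(features[0]))
```

Each token's feature vector grows from 3 to 7 dimensions here (word vector plus label vector), and the resulting sequence is what would be passed on to the downstream encoder.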

Keywords/Search Tags:

Quality Estimation, Machine Translation, Neural Network, Language Model, Machine Learning


Related items

1	Parallel Sequence Decoding In Neural Machine Translation
2	Neural Machine Translation Based Translation Quality Estimation
3	Sentence-Level Machine Translation Quality Estimation Based On Neural Network Features
4	Research On End-to-end Neural Network Machine Translation
5	Research On Machine Translation Quality Estimation Methods Considering Discourse Relation Information
6	A Transformer-based Unified Neural Network For Quality Estimation Of Machine Translation
7	Research On Unsupervised Neural Machine Translation Technique
8	Reinforcement Learning-Based Neural Machine Translation Models
9	Research On Model Learning For Machine Translation
10	Research And Application Of Machine Translation Technology On Recurrent Neural Network