Neural Machine Translation Research On Fusing Multilingual Encoded Information

Posted on: 2018-07-06 | Degree: Master | Type: Thesis
Country: China | Candidate: D Liu | Full Text: PDF
GTID: 2348330536981932 | Subject: Computer Science and Technology
Abstract/Summary:
The development and popularization of deep learning has brought great changes to the field of machine translation in the past few years, allowing Neural Machine Translation (NMT) to surpass Statistical Machine Translation (SMT). In an NMT model, the decoder generates the translation from an abstract representation of the source-language sentence. It is this abstract representation that makes multilingual machine translation possible. Because the sentences in a parallel corpus express the same semantics, the vectors produced by their encoders are related; fusing the encoded information can therefore strengthen and expand the semantic vector and improve translation performance. This thesis studies neural machine translation with fused encoded information on a trilingual parallel corpus. The subject is investigated from three aspects:

(1) Baseline systems. By comparing the experimental results of the Groundhog and Nematus platforms on different data sets, we chose the experimental platform for this research. We then selected an appropriate word embedding dimension for the baseline system, and used the trilingual corpus to train English-Japanese and Chinese-Japanese translation systems on Nematus as the baselines.

(2) Multilingual encoded information fusion models and fusion methods. First, we introduce vector splicing (concatenation) to fuse the encoded information of the two input languages, obtaining a new context vector that carries information from both input languages; the target-language sentence is then generated from this context vector. Next, we introduce a second fusion model. In this model, after the encoded information of the two input languages is obtained, the context vectors are computed separately under the attention mechanism, and the resulting new vector is fed into a single-layer feedforward neural network to generate a richer, more comprehensive semantic vector. The last part builds on the first two sets of experiments and explores other information fusion methods. Since the fused vector contains the semantic information of both input languages, it is essential for decoding. It is therefore necessary to find a fusion method that combines the encoded information of the different input languages well.

(3) Neural machine translation with an intermediate language as the pivot language. In this study, Chinese serves as the pivot language. There are two reasons for introducing a pivot language into neural machine translation: first, to avoid the multilingual input problem of the fusion model; second, to make full use of the existing English-Chinese corpus and the Chinese-to-Japanese translation model. After verifying the effectiveness of this method, we combine the pivot language translation model with the encoded information fusion model, obtaining an encoded information fusion model within the pivot language framework. Because the translation model itself introduces errors, this combined model performs better than the pivot translation model alone, but not as well as the model that fuses multilingual encoded information directly.
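The two fusion strategies in aspect (2) can be illustrated with a minimal sketch in plain Python. It is not the thesis implementation: the dot-product attention, the tanh activation, the choice to concatenate the two context vectors before the feedforward layer, the toy dimensions, and every function and variable name here are assumptions made for illustration only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Context vector: attention-weighted sum of one encoder's states.

    Assumption: plain dot-product scoring, chosen only for brevity.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    return [sum(w * state[i] for w, state in zip(weights, encoder_states))
            for i in range(dim)]


def fuse_by_concatenation(context_a, context_b):
    """First model: splice the two context vectors end to end."""
    return context_a + context_b


def fuse_by_feedforward(context_a, context_b, weights, bias):
    """Second model: pass both context vectors through one feedforward layer.

    Assumption: the layer input is the concatenation of the two vectors
    and the activation is tanh.
    """
    x = context_a + context_b
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


# Toy data: random "encoder states" for an English and a Chinese source sentence.
random.seed(0)
dim = 4
english_states = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
chinese_states = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(3)]
decoder_state = [random.uniform(-1, 1) for _ in range(dim)]

ctx_en = attention_context(decoder_state, english_states)
ctx_zh = attention_context(decoder_state, chinese_states)

spliced = fuse_by_concatenation(ctx_en, ctx_zh)  # length 2 * dim
weights = [[random.uniform(-0.1, 0.1) for _ in range(2 * dim)] for _ in range(dim)]
bias = [0.0] * dim
fused = fuse_by_feedforward(ctx_en, ctx_zh, weights, bias)  # length dim

print(len(spliced), len(fused))
```

In this sketch the spliced vector doubles the context size that the decoder must consume, whereas the feedforward variant projects back to the original dimension, which is one practical difference between the two approaches.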
Keywords/Search Tags:neural machine translation, context vector, fusion, pivot language