
Research On English-Chinese Translation Based On Google's Neural Machine Translation

Posted on: 2020-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Ma
Full Text: PDF
GTID: 2428330590977054
Subject: Computer software and theory
Abstract/Summary:
In recent years, with the re-emergence of deep learning, the neural machine translation (NMT) model has gradually replaced the traditional phrase-based statistical machine translation method. In particular, models based on the Seq2Seq architecture fit the end-to-end translation task well and have become a focus of industry researchers. However, compared with traditional statistical machine translation, NMT models, especially those trained on large-scale data sets, still have defects: training and inference are slower, and translations may be incomplete. At the same time, owing to the limited vocabulary size, NMT also suffers from an out-of-vocabulary (OOV) problem on unregistered words and rare words.

In response to the problems of incomplete translation and OOV mentioned above, we propose the following solutions. (1) To solve the OOV problem, we combine a common stemming technique with the data-compression algorithm BPE (byte pair encoding) in English text preprocessing and propose a different subword-based sequence segmentation method. With this method, we divide the English text into a sequence of subwords, and the Chinese text into a sequence of characters by unigram segmentation. (2) To prevent the decoder from producing incomplete translations, we propose an improved attention mechanism that strengthens the decoder's ability to obtain context information. Inspired by the computation process of traditional attention, the improved mechanism adopts a two-layer computing structure that focuses on the relationships among the decoder's context vectors at different moments, improving the ability of attention to capture the global context information of the encoder. We name this improved mechanism Deep-Attention.

Based on Google's neural machine translation system (GNMT), this thesis evaluates the two improvements mentioned above on three data sets of different scales. The results show that the improved word segmentation method can effectively solve the OOV problem and improve translation accuracy, obtaining an average BLEU improvement of 1.64 points. Deep-Attention shows only a weak advantage over traditional attention, with its BLEU score increased by just 0.3-0.6 points.
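The segmentation method in (1) builds on BPE, which repeatedly merges the most frequent adjacent symbol pair in the training corpus into a new subword symbol. A minimal learning sketch in Python (the corpus and function names are illustrative, not from the thesis; the stemming step the thesis applies before BPE is omitted here):

```python
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merge operations from a word-frequency dictionary."""
    # Represent each word as a tuple of characters plus an end-of-word marker.
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        # Apply the merge everywhere it occurs.
        new_vocab = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges

# Toy corpus: word -> frequency.
corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(corpus, 10)
```

At inference time, the learned merge list is replayed in order on each input word, so rare and unseen words decompose into known subwords instead of falling out of the vocabulary.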
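The abstract describes Deep-Attention in (2) only at a high level (a two-layer structure over the decoder's context vectors at different moments), so the following NumPy sketch is a speculative reconstruction, not the thesis's actual formulation: a first attention layer over the encoder states produces the usual context vector, and a second layer attends over the context vectors from previous decoding steps, with the two combined additively.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, values):
    """Dot-product attention: score each value row against the query,
    then return the softmax-weighted sum of the value rows."""
    weights = softmax(values @ query)
    return weights @ values

def deep_attention(dec_state, enc_states, past_contexts):
    """Two-layer attention sketch (assumption): layer 1 attends over
    encoder states; layer 2 attends over the decoder's past context
    vectors to expose relationships between decoding moments."""
    c_enc = attend(dec_state, enc_states)
    if not past_contexts:
        return c_enc
    c_hist = attend(c_enc, np.stack(past_contexts))
    return c_enc + c_hist  # additive combination is an assumption

rng = np.random.default_rng(0)
enc_states = rng.normal(size=(5, 4))  # 5 encoder states of dimension 4
contexts = []
for step in range(3):                 # three simulated decoding steps
    dec_state = rng.normal(size=4)
    contexts.append(deep_attention(dec_state, enc_states, contexts))
```

The intended effect is that each step's context vector is informed by what earlier steps already attended to, which is one plausible way a second attention layer could help the decoder avoid dropping source content.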
Keywords/Search Tags: Neural Machine Translation, Seq2Seq Model, LSTM, Attention Mechanism