
Research on English-Chinese Translation Based on an Improved Seq2seq Model

Posted on: 2019-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
Full Text: PDF
GTID: 2428330545499750
Subject: Software engineering
Abstract/Summary:
Machine translation is an important topic in the field of natural language processing, with great research value and broad prospects for commercial application. The best-performing approach in machine translation today is neural machine translation, first proposed in 2014, and the most popular neural model is the attention-based seq2seq model. However, existing seq2seq models are mainly optimized and evaluated on the Indo-European language family, with few optimizations targeting Chinese; moreover, existing models do not take into account the transformation of syntax between different languages. Targeting the characteristics of Chinese, this paper applies different methods of text preprocessing and embedding-layer parameter initialization, and improves the structure of the seq2seq model by adding a conversion layer for syntactic changes between the encoder and the decoder. The main work of this paper is as follows:

1. We propose a different text preprocessing method. In natural language processing tasks, unstructured text data must first be converted into a computer-readable format through preprocessing. The traditional Chinese preprocessing method in translation systems converts Chinese sentences into word sequences through word segmentation. However, this method relies on the accuracy of the segmenter and can produce a very large Chinese vocabulary. Given the characteristics of Chinese, namely the large number of Chinese characters, the high information entropy of individual characters, and their strong ideographic capacity, this paper proposes a preprocessing method that converts Chinese sentences into character + named-entity sequences through named entity recognition (sketched below). Experiments show that this preprocessing reduces the translation model's parameter scale and training time by more than 18% on the English-Chinese translation task and improves translation quality by 0.3-0.5 BLEU.
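As an illustration of the preprocessing in item 1, the following is a minimal Python sketch. The named entity recognizer is abstracted behind a callable, since the abstract does not specify which recognizer is used; `find_entities` and the toy recognizer in the usage example are hypothetical stand-ins.

```python
# Minimal sketch: convert a Chinese sentence into a character +
# named-entity sequence. Non-entity text is split into single
# characters; each recognized entity span is kept as one token.
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open character span [start, end)

def to_char_ner_tokens(sentence: str,
                       find_entities: Callable[[str], List[Span]]) -> List[str]:
    spans = sorted(find_entities(sentence))  # hypothetical NER front end
    tokens: List[str] = []
    pos = 0
    for start, end in spans:
        tokens.extend(sentence[pos:start])   # non-entity text -> characters
        tokens.append(sentence[start:end])   # entity span -> one token
        pos = end
    tokens.extend(sentence[pos:])            # trailing characters
    return tokens

# Toy usage with a recognizer that "knows" a single entity:
sent = "刘俊在北京大学学习机器翻译"
toy_ner = lambda s: [(s.index("北京大学"), s.index("北京大学") + 4)]
print(to_char_ner_tokens(sent, toy_ner))
# ['刘', '俊', '在', '北京大学', '学', '习', '机', '器', '翻', '译']
```

Keeping tokens at the character level (plus entity spans) is what bounds the vocabulary size, which in turn shrinks the embedding and output softmax matrices and accounts for the reported reduction in parameter scale and training time.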
2. We propose a different embedding-layer parameter initialization method. The embedding layer is the first layer of a neural network model for text processing; it converts the preprocessed token sequences into sequences of numerical vectors for subsequent computation. In deep learning, the choice of parameter initialization is crucial to model convergence. Existing translation models usually initialize the embedding-layer parameters with pre-trained word embeddings. However, a translation system needs word embeddings for two different languages, and embeddings pre-trained on separate monolingual corpora are not semantically compatible with each other. This paper therefore proposes that, in the English-Chinese translation model, the English side initialize its embedding-layer parameters with GloVe while the Chinese side use random initialization (a minimal sketch follows item 3 below). Experiments show that English-Chinese translation models trained with this initialization gain 0.3-0.6 BLEU on small and medium-sized corpora.

3. We improve the seq2seq model architecture by proposing a conversion-layer structure. In the existing seq2seq model, the encoder maps the source-language sequence to a representation vector, and this vector is used directly as the initial state of the decoder, which generates the target-language sequence. However, this structure does not take into account the syntactic differences between languages. This paper therefore improves the seq2seq architecture by adding a conversion layer for syntactic changes between the encoder and the decoder. The conversion layer consists of two feed-forward neural network layers, residual connections, and a normalization layer (see the second sketch below). Experiments show that the seq2seq model with the conversion layer improves translation quality by 0.7-1.0 BLEU.
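First sketch, for the embedding-layer initialization in item 2: a minimal version assuming PyTorch, with the GloVe matrix taken as given and its rows assumed to be aligned with the English vocabulary (loading GloVe itself is omitted).

```python
# Minimal sketch: asymmetric embedding initialization for an
# English-Chinese translation model. The English (source) embedding is
# initialized from pre-trained GloVe vectors; the Chinese (target)
# embedding keeps PyTorch's default random initialization.
import torch
import torch.nn as nn

def build_embeddings(en_vocab: int, zh_vocab: int, dim: int,
                     glove: torch.Tensor):
    # glove: (en_vocab, dim) matrix, rows aligned with the English
    # vocabulary -- an assumption of this sketch
    en_emb = nn.Embedding(en_vocab, dim)
    en_emb.weight.data.copy_(glove)        # GloVe initialization (English side)
    zh_emb = nn.Embedding(zh_vocab, dim)   # random initialization (Chinese side)
    return en_emb, zh_emb

# Toy usage with random stand-in "GloVe" vectors:
glove = torch.randn(10_000, 300)
en_emb, zh_emb = build_embeddings(10_000, 6_000, 300, glove)
```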
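Second sketch, for the conversion layer in item 3. The hidden width and the ReLU activation are assumptions, since the abstract names only the components (two feed-forward layers, residual connections, and a normalization layer).

```python
# Minimal sketch: a conversion layer placed between encoder and decoder.
# The encoder's representation vector passes through a two-layer
# feed-forward network with a residual connection, followed by layer
# normalization, before it initializes the decoder.
import torch
import torch.nn as nn

class ConversionLayer(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.ReLU(),                       # activation is an assumption
            nn.Linear(d_hidden, d_model),
        )
        self.norm = nn.LayerNorm(d_model)

    def forward(self, enc_state: torch.Tensor) -> torch.Tensor:
        # residual connection around the feed-forward network,
        # followed by normalization
        return self.norm(enc_state + self.ff(enc_state))

# Toy usage: transform a batch of encoder final states into
# decoder initial states
layer = ConversionLayer(d_model=512, d_hidden=2048)
dec_init = layer(torch.randn(32, 512))
```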
Keywords/Search Tags: Deep learning, Neural machine translation, Seq2seq model, Attention mechanism, Named entity recognition