
Research On Syntax-based Model Enhancement Methodology For Sequence-to-sequence Model

Posted on: 2020-10-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C P Ma
Full Text: PDF
GTID: 1368330614950828
Subject: Computer Science and Technology
Abstract/Summary:
The sequence-to-sequence model is one of the most popular models in artificial intelligence, especially in natural language processing. It converts one sequence directly into another within a unified framework. Many problems can be cast in this form and then solved by a sequence-to-sequence model: for applications such as machine translation, syntactic parsing, and speech recognition, reformulating the inputs and outputs yields a unified solution to all of them. On the other hand, the use of syntactic information is one of the most important topics in computational linguistics, and much previous research has demonstrated its effectiveness for natural language processing. Using syntactic information to improve model performance is therefore an important goal. Based on this analysis, this thesis focuses on the following problem: enhancing the sequence-to-sequence model with syntactic information.

The sequence-to-sequence model consists of three parts: an encoder, an attention mechanism module, and a decoder. In both the encoder and the decoder, the lowest module is the word embedding module. In addition, an auxiliary output module can be placed above the hidden layer to guide the learning of the hidden layer. This thesis therefore studies how to enhance the sequence-to-sequence model by incorporating syntactic information into three modules: the word embedding module, the attention mechanism module, and the output module. Furthermore, it analyzes the fundamental principle of the attention mechanism and proposes a novel general attention mechanism. Specifically, this thesis makes the following contributions.

First, it proposes three methods for incorporating syntactic information into the word embedding module. To address the sensitivity of conventional syntax-based models to parsing errors, we propose an encoding algorithm for compressed syntactic forests. For the state-of-the-art Transformer, we propose a new positional encoding method that injects syntactic positional information into the word embedding module. To cope with the problem of overly long linearized syntactic sequences, we propose a method based on neural syntactic distances. All three methods succeed in enhancing the word embedding module with syntactic information, each from a different angle.

Second, it proposes three methods for incorporating syntactic information into the attention mechanism module. Deterministic attention lets a constituent parser built on the sequence-to-sequence model use linguistic knowledge about constituent parsing to guide the learning of the model. Syntax-based self-attention constrains the self-attention module with syntactic information. Forest-based attention lets the decoder compute the context vector according to the qualities of the constituent trees, so that target words are generated from better context vectors.

Third, it proposes a method that improves hidden-layer representations by adding an auxiliary output layer above the hidden layer. This layer predicts the sequence of neural syntactic distances; by supervising the predicted sequence with the true neural syntactic distances, syntactic information is merged into the hidden layer, improving the quality of the hidden-layer vectors.
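To make the third contribution concrete, the following is a minimal PyTorch-style sketch of an auxiliary output layer that predicts a syntactic distance for each position from the hidden states and adds its loss to the main training objective. The module name SyntacticDistanceHead, the single linear projection, the MSE loss, and the 0.5 interpolation weight are illustrative assumptions, not the thesis's exact formulation.

import torch
import torch.nn as nn

class SyntacticDistanceHead(nn.Module):
    # Auxiliary output layer (hypothetical): maps each hidden state to a
    # scalar syntactic distance, so that supervision on the distances
    # flows back into the hidden layer during training.
    def __init__(self, hidden_size):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size)
        return self.proj(hidden_states).squeeze(-1)  # (batch, seq_len)

# Joint objective: the main sequence-to-sequence loss plus the auxiliary
# distance loss. The tensors below are stand-ins; in practice they come
# from the seq2seq model and from a syntactic parse of the input.
head = SyntacticDistanceHead(hidden_size=512)
hidden_states = torch.randn(2, 7, 512)
gold_distances = torch.randn(2, 7)
aux_loss = nn.functional.mse_loss(head(hidden_states), gold_distances)
main_loss = torch.tensor(0.0)  # placeholder for the seq2seq cross-entropy
loss = main_loss + 0.5 * aux_loss  # arbitrary interpolation weight

Because the auxiliary loss is differentiable with respect to the hidden states, gradients from the distance supervision reshape the hidden-layer vectors alongside the main objective.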
Fourth, this thesis investigates the fundamental principle of the attention mechanism module in the sequence-to-sequence model. For the word alignment task in machine translation, we compare two sequence-to-sequence models built on different neural networks and offer new insights into the attention mechanism module. Furthermore, this thesis proposes a new attention mechanism, the axiomatic attention mechanism, which is applicable to any sequence-to-sequence model, is independent of the specific neural network, and learns word alignments well.

In summary, this thesis significantly enhances the sequence-to-sequence model by fully exploiting syntactic information, which benefits machine translation, syntactic parsing, and other natural language processing tasks. It also offers new insights into the attention mechanism that should be helpful for further research on sequence-to-sequence models.
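To illustrate the attention-as-alignment view that this analysis builds on, below is a minimal sketch of scaled dot-product attention whose weight matrix can be read as a soft source-target word alignment; taking the argmax per target position yields hard alignments. This is generic textbook attention for illustration only, not the axiomatic attention mechanism proposed in the thesis.

import torch

def attention_alignment(queries, keys):
    # queries: (tgt_len, d) decoder states; keys: (src_len, d) encoder states
    scores = queries @ keys.T / keys.shape[-1] ** 0.5  # (tgt_len, src_len)
    weights = torch.softmax(scores, dim=-1)  # each row is a soft alignment
    return weights, weights.argmax(dim=-1)  # hard alignment per target word

tgt_states = torch.randn(5, 64)
src_states = torch.randn(8, 64)
soft, hard = attention_alignment(tgt_states, src_states)
# soft[i, j] can be read as the probability that target word i aligns to
# source word j; hard[i] is the most probable source position for word i.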
Keywords/Search Tags:sequence-to-sequence model, syntactic information, model enhancement, machine translation, parsing