
CCG Supertagging Based On Deep Learning Models

Posted on: 2019-02-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: REKIA KADARI
GTID: 1368330566497847
Subject: Computer Science and Technology
Abstract/Summary:
Making computers understand and manipulate human languages has been a subject of research in Artificial Intelligence (AI) for many years. Interacting with computers through AI systems using natural language is often referred to as Natural Language Processing (NLP). NLP has many applications that are widely used in our daily lives. Sequence labeling is one of the oldest fields in NLP and includes tasks such as part-of-speech tagging and Combinatory Categorial Grammar (CCG) supertagging. CCG supertagging serves as the first important stage for many NLP applications, after which further processing such as chunking and parsing is done. The task can be defined as follows: given a sequence of words, assign a CCG supertag to each word in the sequence. The major challenge of CCG supertagging is the huge size of the category set and the large number of categories that can be assigned to each item, which makes many applications very complex; this makes it a critical task in the NLP community.

Considerable approaches have been proposed to deal with the CCG supertagging problem, most of them based on statistical machine learning models. However, most current machine learning methods work well only because of well-designed human representations and input features. In recent research, automatically extracting features that capture information about the input representation has become very important. Deep learning can be seen as putting representation learning back together with machine learning: it attempts to jointly learn good features and input representations.

In this thesis, we focus on the CCG supertagging task and propose techniques that reduce the number of lexical categories assigned to each word of an input. Our goal is to develop simple and accurate models, based on deep learning, that solve the challenging CCG supertagging problem and learn the necessary intermediate representations of the input without extensive feature engineering. We believe that current CCG supertagging models face three main problems. The first concerns modeling long sequences, which Recurrent Neural Networks (RNNs) fail to do, tending to memorize information for only a few time steps. The second concerns output dependencies: since deep learning models benefit from input-level information while statistical machine learning algorithms benefit from output dependencies, a model that exploits both is necessary for CCG supertagging as a structured prediction task. The third concerns Out-Of-Vocabulary (OOV) words: the accuracy of existing models decreases in the presence of unseen and rare words. The general objective of this thesis is therefore to propose novel deep learning techniques for the CCG supertagging problem that improve the capability to reduce the number of predicted supertags and solve the problems mentioned above; furthermore, no lexical or hand-crafted features are required. In particular, the following specific issues are considered in this work:

1) How to memorize information from sequential data is still a critical issue for many sequence tagging tasks, and for CCG supertagging in particular. We present a new method for CCG supertagging based on Gated Recurrent Unit (GRU) networks. In order to capture the input from both the left and the right direction, a Bidirectional GRU (BGRU) model is used. Moreover, a deep architecture is adopted in order to learn complex interactions between input entries. The reported results show that the proposed model improves supertagging and multi-tagging performance for the CCG grammar.
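To make the BGRU architecture of point 1 concrete, below is a minimal PyTorch sketch of a deep bidirectional GRU supertagger. It is an illustrative reading of the idea, not the thesis's actual implementation; the class name BGRUTagger and all sizes (embedding dimension, hidden units, depth, vocabulary and tagset sizes) are assumptions.

```python
import torch
import torch.nn as nn

class BGRUTagger(nn.Module):
    """Deep bidirectional GRU supertagger (illustrative sketch only).

    All hyperparameters are placeholder assumptions, not the values
    reported in the thesis.
    """
    def __init__(self, vocab_size, tagset_size, emb_dim=128,
                 hidden=256, depth=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Stacked bidirectional GRU: each position sees both the left
        # and the right context of the sentence.
        self.bgru = nn.GRU(emb_dim, hidden, num_layers=depth,
                           bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, tagset_size)

    def forward(self, word_ids):                 # (batch, seq_len)
        hidden_states, _ = self.bgru(self.embed(word_ids))
        return self.out(hidden_states)           # (batch, seq_len, tags)

# Toy usage: per-word supertag scores for 3 sentences of 10 words.
# 425 is roughly the size of the commonly used CCGbank supertag set.
model = BGRUTagger(vocab_size=50_000, tagset_size=425)
scores = model(torch.randint(0, 50_000, (3, 10)))
print(scores.shape)  # torch.Size([3, 10, 425])
```

Stacking layers (depth > 1) supplies the "deep architecture" mentioned above, while the bidirectional flag is what lets every position condition on both left and right context.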
2) We present a new method named "Backward-BLSTM" for CCG supertagging. Long Short-Term Memory (LSTM) networks are adopted as a more powerful alternative to GRU networks for memorizing information and selecting the most likely supertag. The proposed architecture proves its efficiency for both supertagging and multi-tagging for the CCG grammar, and the experimental results show that it models long sequences effectively and achieves better performance than state-of-the-art models.

3) Many approaches have been proposed for the CCG supertagging task. However, these models either rely on many hand-crafted features, in the case of machine learning strategies, or use sentence-level representations that process a sequence without modeling correlations between neighboring labels, which have a great influence on predicting the current label, in the case of deep learning models. Labeling a given sequence with CCG syntactic categories while taking tag-level information into account is a critical point. In this work, we therefore combine Conditional Random Fields (CRF) with a BLSTM model. The model first learns a sentence representation, gaining from both past and future input features and storing information over long periods thanks to the BLSTM architecture; afterward, it uses sentence-level tag information through a CRF layer, which acts as the output predictor (a minimal sketch of this combination is given after this summary). The model thus benefits from both input and output dependencies and is more competent than state-of-the-art methods. The achieved results demonstrate that the proposed model outperforms existing approaches for both CCG supertagging and multi-tagging.

4) Even though some of the literature has taken advantage of deep learning models for CCG supertagging, there is still no comprehensive research on how to deal with OOV entries. With this in mind, we present a new method that exploits the strengths of different embeddings in a simple and effective way. Pre-trained word embeddings are used to extract informative similarity between words, and character embeddings, obtained from character lookup tables, represent morphological information between words. BLSTM networks are applied to both the character and word embeddings, and their outputs are concatenated to generate the final predictions. The experimental results show that our method outperforms word-embedding-based models on both in-domain and out-of-domain datasets.

For the CCG supertagging problem, a deep study of the literature is carried out and the limitations of currently published techniques are highlighted. Starting from this analysis, novel approaches are theoretically proposed, implemented, and tested on several datasets to verify their effectiveness. The achieved experimental results confirm the effectiveness of all the proposed techniques.
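As an illustration of the BLSTM-CRF combination described in point 3, the sketch below pairs a bidirectional LSTM emission scorer with a learned tag-transition matrix and Viterbi decoding. It is a minimal reading of the idea in PyTorch, not the thesis's actual implementation; the class name BLSTMCRFTagger and all sizes are hypothetical, and training would additionally require the CRF forward algorithm to compute the partition function.

```python
import torch
import torch.nn as nn

class BLSTMCRFTagger(nn.Module):
    """BLSTM emissions plus a CRF transition matrix (decoding-only sketch).

    Hypothetical illustration: names and sizes are placeholders, not the
    configuration used in the thesis.
    """
    def __init__(self, vocab_size, tagset_size, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # The BLSTM produces per-position emission scores from both the
        # past and the future context of the sentence.
        self.blstm = nn.LSTM(emb_dim, hidden, bidirectional=True,
                             batch_first=True)
        self.emit = nn.Linear(2 * hidden, tagset_size)
        # trans[i, j]: learned score for moving from tag i to tag j --
        # the sentence-level tag information contributed by the CRF layer.
        self.trans = nn.Parameter(torch.zeros(tagset_size, tagset_size))

    def viterbi(self, word_ids):
        """Best tag sequence for one sentence of shape (1, seq_len)."""
        emissions = self.emit(self.blstm(self.embed(word_ids))[0])[0]
        score = emissions[0]         # best path score ending in each tag
        backpointers = []
        for e in emissions[1:]:
            # total[i, j] = score of ending in tag i, then moving to tag j
            total = score.unsqueeze(1) + self.trans + e
            score, best_prev = total.max(dim=0)
            backpointers.append(best_prev)
        path = [int(score.argmax())]
        for best_prev in reversed(backpointers):
            path.append(int(best_prev[path[-1]]))
        return list(reversed(path))

# Toy usage: decode a 6-word sentence over a 425-tag set.
model = BLSTMCRFTagger(vocab_size=50_000, tagset_size=425)
print(model.viterbi(torch.randint(0, 50_000, (1, 6))))
```

The design point this makes concrete is the division of labor: the BLSTM scores each word from its input context, while the transition matrix scores adjacent tag pairs, so decoding picks the sequence that is jointly best under both input and output dependencies.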
Keywords/Search Tags:Natural Language Processing, Combinatory Categorial Grammar, CCG Supertagging, Deep Learning, Neural Networks