
Research On Knowledge Expansion And Representation Learning For Implicit Discourse Relation Recognition

Posted on: 2020-02-08  Degree: Master  Type: Thesis
Country: China  Candidate: Y Xu  Full Text: PDF
GTID: 2428330578977961  Subject: Computer Science and Technology
Abstract/Summary:
Discourse relation recognition studies the semantic and logical relationships between arguments and is an important task in natural language processing. The Penn Discourse Treebank (PDTB) is an authoritative corpus in this field; its samples are divided into explicit and implicit discourse samples according to whether a connective appears between the arguments. Explicit discourse relation recognition now achieves performance above 90%, while implicit discourse relation recognition performs poorly because it lacks explicit clues. In this paper, we propose a method for implicit discourse relation recognition based on knowledge expansion and representation learning. The main research contents cover the following three aspects:

(1) Using Active Learning to Expand Training Data for Implicit Discourse Relation Recognition

Existing models perform poorly on relations with few training samples, because discourse linguistic resources are limited in scale and neural network models depend on large amounts of training data. Adding training samples is therefore an effective way to improve classification performance. To expand discourse data, previous studies matched large external corpora with connective templates and removed the connectives from the matched explicit samples to use them as pseudo-implicit ones. Adding these pseudo samples directly to the training set may degrade performance, because removing connectives changes the semantics to some extent and external data contain considerable noise. To address these problems, this paper uses active learning to select highly informative samples to join the training set, thereby improving classification performance.

(2) Stacked-Attention Based Implicit Discourse Relation Recognition

Existing research improves performance through classification models based on neural representation learning, but these methods ignore the key information within each argument and the interactions between arguments. To address this, this paper proposes a novel discourse relation recognition method based on stacked attention. The method uses the self-attention distribution representation to compute interactive-attention weights, and on this basis strengthens the weights of relevant information between the arguments by fusing the self-attention and interactive-attention information.

(3) Hierarchical Representation for Implicit Discourse Relation Recognition

Enhancing the interactive information between arguments cannot express the full semantics of the argument pair. Existing methods often treat argument pairs as independent units, ignoring the semantic influence of their context. We therefore propose a method based on hierarchical representation. It uses a word-level attention mechanism to extract the more important words and phrases, and an argument-level attention mechanism to assign higher weights to the more important arguments. Finally, we obtain an argument-pair representation that incorporates context information, in which a context-aware attention mechanism further strengthens the interaction of information between arguments. This method thus strengthens both the interaction between the arguments and the interaction between the arguments and their context.

The methods above alleviate the problems of unbalanced sample distribution and one-sided classification clues from the perspectives of knowledge expansion and representation learning. On four-way classification, the method achieves an accuracy of 60.63% and a Macro-F1 of 44.48%, and its accuracy exceeds that of existing corpus-expansion methods. On binary classification, the methods outperform the state of the art on the Expansion and Temporal relations, with F1 scores of 72.41% and 37.56% respectively.
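The abstract does not specify which acquisition function the active-learning step uses to pick "highly informative" pseudo-implicit samples. A common choice is entropy-based uncertainty sampling; the sketch below is an illustrative example of that idea only (the function name, candidate pool, and class count are hypothetical, not taken from the thesis).

```python
import numpy as np

def select_informative_samples(pool_probs, k):
    """Pick the k most informative candidates by predictive entropy.

    pool_probs: (n_samples, n_classes) class probabilities assigned by
    the current classifier to the candidate pool of pseudo-implicit
    samples. Higher entropy = less certain = more informative.
    """
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(pool_probs * np.log(pool_probs + eps), axis=1)
    # indices of the k highest-entropy candidates
    return np.argsort(entropy)[::-1][:k]

# Toy pool of 4 candidates over 4 top-level PDTB classes (illustrative).
probs = np.array([
    [0.97, 0.01, 0.01, 0.01],  # confident prediction -> low information
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain
    [0.40, 0.30, 0.20, 0.10],
    [0.70, 0.10, 0.10, 0.10],
])
print(select_informative_samples(probs, 2))  # -> [1 2]
```

The selected samples would then be labeled (or kept with their connective-derived pseudo-labels) and added to the training set, and the loop repeats with the retrained classifier.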
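The stacked-attention idea in aspect (2) — using self-attention distributions to drive interactive-attention weights, then fusing both — can be sketched in a minimal form. The thesis does not give the exact equations, so everything below (the re-weighting scheme, the concatenation fusion, the matrix shapes) is one plausible reading, not the author's actual model.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def stacked_attention(arg1, arg2):
    """Fuse self-attention and interactive attention over an argument pair.

    arg1: (m, d) and arg2: (n, d) word representations.
    Self-attention scores each argument's own words; interactive
    attention is then computed on the self-attention-reweighted words,
    so the two attention layers are "stacked".
    """
    # self-attention: average attention each word receives within its argument
    self1 = softmax(arg1 @ arg1.T).mean(axis=0)            # (m,)
    self2 = softmax(arg2 @ arg2.T).mean(axis=0)            # (n,)
    # re-scale words by their self-attention weights
    arg1_w = arg1 * self1[:, None]
    arg2_w = arg2 * self2[:, None]
    # interactive attention between the re-weighted arguments
    inter = softmax(arg1_w @ arg2_w.T)                     # (m, n)
    # fusion: each arg1 word concatenated with its attended arg2 context
    fused = np.concatenate([arg1_w, inter @ arg2_w], axis=-1)  # (m, 2d)
    return fused

rng = np.random.default_rng(0)
a1, a2 = rng.normal(size=(5, 8)), rng.normal(size=(7, 8))
print(stacked_attention(a1, a2).shape)  # -> (5, 16)
```

A pooled version of `fused` (e.g. a max over the m rows) would feed the relation classifier; a real implementation would use learned projection matrices rather than raw dot products.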
Keywords/Search Tags:Implicit Discourse Recognition, Knowledge Expansion, Representation Learning, Active Learning, Attention Mechanism, Paragraph Information