Research On Relational Clue Mining And Joint Learning For Implicit Discourse Relation Classification

Posted on: 2017-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: S S Zhu
Full Text: PDF
GTID: 2308330488461865
Subject: Computer Science and Technology
Abstract/Summary:
Discourse analysis is an important research area concerned with understanding the inner structure of a text and recognizing the structural discourse relations between adjacent arguments. In this field, a discourse is a span of free text in a document that consists of a series of arguments. In general, a pair of arguments is semantically coherent and structurally cohesive. Relation classification between two arguments is recognized as an important subtask; its goal is to automatically determine the specific type of relation that holds between the two arguments.
In the Penn Discourse Treebank 2.0 corpus (PDTB v2.0), discourse relations fall into explicit and implicit cases, depending on whether a connective appears between the two arguments. For explicit cases, previous studies have shown that the relation is easy to determine, with reported performance reaching 93.09%. For implicit cases, there is no explicit connective between the arguments, so the relation cannot be determined directly. In this paper, we focus on implicit discourse relation recognition, in particular on extracting explicit relation clues and on combining the strengths of multiple classification methods. The research covers the following three aspects:
1) Implicit Discourse Relation Inference Based on the External Relation
The discourse relation between two arguments is triggered by the external elements of each argument. On this basis, we propose a novel implicit discourse relation inference approach based on the external relation. The method follows the existing inference pattern that uses explicit relations to infer implicit ones. First, we search the external data for explicit reference arguments whose content is similar to that of the test arguments; then we rank these explicit reference arguments with a standard sorting algorithm. Finally, we predict the implicit discourse relation based on the ranking results.
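The retrieve, rank, and predict steps above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the thesis: the bag-of-words cosine similarity, the averaging of Arg1 and Arg2 similarity, the top-k majority vote, the function names, and the toy reference pairs and their labels.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector as a Counter (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pair_similarity(test_pair, ref_pair):
    """Average of Arg1-Arg1 and Arg2-Arg2 similarity (an assumed scoring rule)."""
    return 0.5 * (cosine(bow(test_pair[0]), bow(ref_pair[0]))
                  + cosine(bow(test_pair[1]), bow(ref_pair[1])))


def infer_relation(test_pair, references, k=3):
    """Rank explicit reference pairs by similarity, then vote among the top k."""
    ranked = sorted(references,
                    key=lambda ref: pair_similarity(test_pair, ref[:2]),
                    reverse=True)
    votes = Counter(relation for _, _, relation in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical explicit reference pairs: (Arg1, Arg2, relation signalled by a connective).
references = [
    ("it rained all day", "the match was cancelled", "Contingency"),
    ("the storm hit the coast", "the game was cancelled", "Contingency"),
    ("he likes tea", "she prefers coffee", "Comparison"),
    ("prices rose sharply", "sales stayed flat", "Comparison"),
]

test_pair = ("it rained heavily", "the match was postponed")
predicted = infer_relation(test_pair, references, k=2)
print(predicted)
```

Here the two most similar reference pairs both carry the same explicit relation, so the vote assigns that relation to the implicit test pair. A real system would replace the toy similarity with a stronger retrieval and ranking model.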
In particular, the method focuses on mining the text fragments that jointly trigger the discourse relation between two arguments (called external elements), and it predicts the implicit discourse relation of the arguments with reference to the relation between the two external elements.
2) Research on Implicit Discourse Relation Datasets Expansion Method for Imbalanced Data
The distribution of current discourse training data is imbalanced, which negatively affects recognition performance. To overcome this problem, we propose a novel method for expanding the implicit training set. We introduce argument vectors to improve the representation of arguments. First, we mine reliable discourse relation samples from external data sets based on the argument vectors; then we add the mined samples to the existing training data, thereby enriching it; finally, we run experiments on the extended data set.
3) Stacked Learning for Implicit Discourse Relation Classification Optimization
Existing discourse relation detection systems have distinct advantages, such as superior classification models, reliable feature selection, or rich training data. This suggests that it is feasible to make the systems collaborate within a uniform framework. In this paper, we propose a collaborative approach based on stacked learning. The approach involves two levels of learning: a base level and a meta level. At the base level, it evaluates different well-trained classification systems; at the meta level, it treats the suitability of the base-level classifiers as novel features and uses them to build a superior classifier.
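The two-level idea behind stacked learning can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's system: the base classifiers are hypothetical hand-written rules, the input features (`neg`, `overlap`) are invented, and the meta-level learner is a simple lookup table from base-level outputs to the most frequent gold label on held-out data, with a majority-vote fallback.

```python
from collections import Counter, defaultdict


def clf_negation(x):
    """Hypothetical base system 1: relies on a negation cue."""
    return "Comparison" if x["neg"] else "Expansion"


def clf_overlap(x):
    """Hypothetical base system 2: relies on lexical overlap between arguments."""
    return "Expansion" if x["overlap"] > 0.5 else "Contingency"


def train_meta(base_classifiers, held_out):
    """Meta level: map each tuple of base predictions to its most frequent gold label."""
    table = defaultdict(Counter)
    for x, gold in held_out:
        key = tuple(clf(x) for clf in base_classifiers)
        table[key][gold] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in table.items()}


def predict_stacked(base_classifiers, meta, x):
    """Use the meta table when the prediction pattern was seen, else majority vote."""
    preds = tuple(clf(x) for clf in base_classifiers)
    if preds in meta:
        return meta[preds]
    return Counter(preds).most_common(1)[0][0]


base = [clf_negation, clf_overlap]

# Hypothetical held-out examples: (features, gold relation).
held_out = [
    ({"neg": True, "overlap": 0.2}, "Comparison"),
    ({"neg": True, "overlap": 0.1}, "Comparison"),
    ({"neg": False, "overlap": 0.2}, "Contingency"),
    ({"neg": False, "overlap": 0.3}, "Contingency"),
    ({"neg": False, "overlap": 0.9}, "Expansion"),
]

meta = train_meta(base, held_out)
stacked = predict_stacked(base, meta, {"neg": False, "overlap": 0.4})
print(stacked)
```

The base systems disagree on the last input, and the meta level resolves the disagreement using what it learned on held-out data about which system to trust for that pattern of outputs. In practice the meta level would usually be a trained classifier over richer base-level signals such as confidence scores.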
Keywords/Search Tags: Implicit discourse relation, external relation measurement, training sets enrichment, argument vectors, stacked learning