
Research on a Neural-Network-Based Extractive Summarization System

Posted on: 2019-10-08
Degree: Master
Type: Thesis
Country: China
Candidate: H P Zhao
Full Text: PDF
GTID: 2428330566498104
Subject: Computer Science and Technology
Abstract/Summary:
Automatic text summarization is the process of generating a concise representation of an original text while retaining its core information. Summarization algorithms can be broadly classified into two categories: extractive and abstractive. Extractive approaches select salient words, phrases, or sentences from the original text, while abstractive methods rewrite the content without being constrained to reuse words or phrases from it. Automatic summarization can aid many downstream applications (e.g., news digests, social media).

Recently, data-driven approaches based on neural networks have become popular for modeling the extractive summarization task. A few recent approaches conceptualize extractive summarization as a sequence labeling task. A problem with this formulation is the discrepancy between training and testing: at test time, we treat the task as a ranking problem. We therefore present a regression model that learns to score sentences so that the scores fit ROUGE during training. Our regression model outperforms other state-of-the-art models when generating short summaries, but shows no gain when generating long summaries.

In many scenarios, we face the problem of summarizing a large number of highly redundant options. Consider two sentences in a summary: "what is the price of this T-shirt?" and "how much is the T-shirt?". The two sentences are worded differently but convey the same meaning. In this paper, we provide an empirical analysis of the redundancy problem. Many existing extractive summarization systems model sentence importance and sentence redundancy in two separate processes, namely sentence scoring and sentence selection. We present a redundancy-aware ranking model that uses a greedy algorithm to model sentence importance and redundancy simultaneously. We use neural networks to model the representations of sentences and documents without any hand-crafted linguistic features. Experimental results show that our model outperforms state-of-the-art extractive systems on the Daily Mail news highlights dataset.
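
The idea of scoring importance and redundancy in a single greedy pass can be illustrated with a minimal sketch. This is not the model described in the abstract: the neural sentence scorer is replaced by hand-set importance values, redundancy is approximated by Jaccard word overlap, and the names `word_set`, `overlap`, `greedy_select`, and the `penalty` weight are all hypothetical choices made for illustration.

```python
import re
from typing import List, Sequence, Set


def word_set(sentence: str) -> Set[str]:
    """Lowercased word set; a crude stand-in for a learned sentence representation."""
    return set(re.findall(r"[a-z0-9-]+", sentence.lower()))


def overlap(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two sentences (assumed redundancy proxy)."""
    sa, sb = word_set(a), word_set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def greedy_select(
    sentences: Sequence[str],
    importance: Sequence[float],
    k: int,
    penalty: float = 1.0,
) -> List[int]:
    """Pick up to k sentence indices, one per step.

    At each step the candidate with the highest
    importance - penalty * (max overlap with already selected sentences)
    is added, so importance and redundancy are weighed together.
    """
    selected: List[int] = []
    remaining = list(range(len(sentences)))
    while remaining and len(selected) < k:
        def gain(i: int) -> float:
            redundancy = max(
                (overlap(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return importance[i] - penalty * redundancy

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy input: the first two sentences are the paraphrase pair from the abstract;
# the third sentence and all importance values are made up for this example.
sentences = [
    "what is the price of this T-shirt ?",
    "how much is the T-shirt ?",
    "does it come in blue ?",
]
importance = [0.9, 0.8, 0.5]

print(greedy_select(sentences, importance, k=2))  # prints [0, 2]
```

With `penalty=0.0` the same call returns the two paraphrases, which is the behavior of a pipeline that ranks by importance alone and ignores redundancy during selection.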
Keywords/Search Tags: neural networks, extractive summarization, ranking model, redundancy