Methods of Sentence Extraction, Abstraction and Ordering for Automatic Text Summarization

Posted on: 2018-05-21
Degree: M.Sc
Type: Thesis
University: University of Lethbridge (Canada)
Candidate: Nayeem, Mir Tafseer
Full Text: PDF
GTID: 2478390020456757
Subject: Information Science
Abstract/Summary:
In this thesis, we develop several techniques for both extractive and abstractive text summarization. We implement a rank-based sentence selection method that retains the most important, non-redundant content to form the summary. To ensure pure sentence abstraction, we propose several novel techniques that jointly perform compression, fusion and paraphrasing at the sentence level. We also model abstractive compression generation as a sequence-to-sequence (seq2seq) problem using an encoder-decoder framework, which is itself a novel addition to state-of-the-art text summarization systems. We propose simple yet effective solutions to several common problems in neural seq2seq models, such as redundant repetition and unknown token replacement. Our sentence-level models improve both the informativity and the grammaticality of the generated sentences. Furthermore, we apply our sentence abstraction techniques to multi-document text summarization. We also propose a greedy sentence ordering algorithm that maintains summary coherence and so increases readability, and we introduce an optimal solution to the summary length limit problem. For the sentence-level tasks, we conduct our experiments on human-generated abstractive compression datasets and evaluate our system using several newly proposed Machine Translation (MT) evaluation metrics. For document-level summaries, we conduct experiments on the Document Understanding Conference (DUC) 2004 datasets using the ROUGE toolkit. Our experiments demonstrate that our methods bring significant improvements over state-of-the-art methods. At the end of this thesis, we also introduce a new concept called "Reader Aware Summary", which can generate summaries for specific groups of readers (e.g. non-native readers).
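The abstract does not give the details of the rank-based sentence selection, so the following is only a minimal sketch of the general idea (pick highly ranked sentences, skip redundant ones, respect a length limit), not the thesis's method. The function names, the Jaccard word-overlap redundancy test, the `max_overlap` threshold and the greedy word budget are all assumptions made for illustration; in particular, a greedy budget is not an optimal solution to the length limit problem, and returning sentences in source order is only a placeholder for a real ordering step.

```python
import re


def tokenize(sentence):
    """Lowercased set of alphanumeric word tokens (illustrative tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def select_sentences(sentences, scores, max_words=100, max_overlap=0.5):
    """Greedily pick high-scoring, non-redundant sentences within a word budget.

    `scores` holds one precomputed importance score per sentence; how those
    scores are obtained is outside the scope of this sketch.
    """
    # Visit sentence indices from highest to lowest score.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)

    chosen = []          # indices of selected sentences
    chosen_tokens = []   # token sets of selected sentences
    used_words = 0

    for i in ranked:
        tokens = tokenize(sentences[i])
        length = len(sentences[i].split())

        # Skip empty sentences and sentences that would exceed the budget.
        if not tokens or used_words + length > max_words:
            continue

        # Skip a sentence whose Jaccard overlap with any chosen one is too high.
        redundant = any(
            len(tokens & seen) / len(tokens | seen) > max_overlap
            for seen in chosen_tokens
        )
        if redundant:
            continue

        chosen.append(i)
        chosen_tokens.append(tokens)
        used_words += length

    # Placeholder ordering: emit the selected sentences in source order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "The cat sat on the mat today.",
        "Dogs bark loudly at night.",
    ]
    print(select_sentences(docs, [0.9, 0.8, 0.5], max_words=20))
```

Here the second sentence is dropped because it overlaps almost entirely with the first, while the third is kept because it shares no words with it and still fits the budget.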
Keywords/Search Tags:Sentence, Text, Abstraction, Summary, Methods, Several