Machine learning in automatic text summarization: From extracting to abstracting

Posted on: 2007-03-10
Degree: Ph.D.
Type: Dissertation
University: University of Illinois at Chicago
Candidate: Xie, Zhuli
Full Text: PDF
GTID: 1448390005965453
Subject: Computer Science
Abstract/Summary:
Automatic text summarization is one tool people use to cope with information overload. This research provides a machine learning framework for automatic text summarization. Linguistic techniques remain indispensable, but they are not the central part of this dissertation. With the machine learning framework in place, there is no longer a need to search for suitable linguistic techniques to produce better summaries each time the evaluation standard changes. Under this framework, the summarizer learns directly from human-written summaries, which also removes the redundant work of defining a "gold standard" for sentence ranking. Researchers can therefore pay more attention to the fundamental aspects of automatic text summarization, such as better evaluation metrics and methods.

This dissertation research consists of two distinct stages. In the first stage, the summarizer aims to produce extractive summaries that, on average, best fit a given evaluation metric, by learning optimal sentence ranking functions. We show that, under this framework, the summarizer can easily adapt to a new evaluation metric. In the second stage, we investigate how to produce quasi-abstractive summaries, a new type of machine-generated summary that does not use whole sentences or clauses from the text. This new type of summary aims to solve one problem with existing extractive summaries: their relatively low similarity to human-written abstracts. We also address local coherence in the quasi-abstractive summaries. Our experiments show that our approach is successful.
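The first stage described above, learning a sentence ranking function that fits a chosen evaluation metric, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the dissertation: the two features (sentence position and average term frequency), the linear scoring function, the grid search over weights, and the unigram-recall metric are all hypothetical stand-ins. The sketch only shows the general idea that the summarizer is tuned against human-written summaries through a pluggable metric.

```python
import itertools
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_recall(candidate, reference):
    """Hypothetical metric: share of reference unigrams covered by the candidate."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / total


def sentence_features(sentences):
    """Two illustrative features per sentence: position score and average term frequency."""
    doc_tf = Counter(tok for s in sentences for tok in tokenize(s))
    n = len(sentences)
    feats = []
    for i, s in enumerate(sentences):
        toks = tokenize(s)
        position = 1.0 - i / n  # earlier sentences score higher
        avg_tf = sum(doc_tf[t] for t in toks) / len(toks) if toks else 0.0
        feats.append((position, avg_tf))
    return feats


def extract(sentences, weights, k):
    """Rank sentences by a weighted feature sum and return the top k in document order."""
    feats = sentence_features(sentences)
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in feats]
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return " ".join(sentences[i] for i in sorted(top))


def learn_weights(corpus, metric, k=1, grid=(0.0, 0.5, 1.0)):
    """Grid-search the weights that maximize the average metric over (document, human summary) pairs."""
    best_w, best_score = None, -1.0
    for w in itertools.product(grid, repeat=2):
        avg = sum(metric(extract(doc, w, k), ref) for doc, ref in corpus) / len(corpus)
        if avg > best_score:
            best_w, best_score = w, avg
    return best_w, best_score
```

In this sketch, adapting to a new evaluation metric amounts to passing a different `metric` function to `learn_weights`; the feature extraction and ranking code stay unchanged.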
Keywords/Search Tags: Automatic text summarization, Machine learning, Summaries