Research On Automatic Summarization System Based On RSS

Posted on: 2013-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Liu
GTID: 2248330371468278
Subject: Information resource management
Abstract/Summary:
As the volume of information grows, it becomes increasingly valuable to retrieve information efficiently and obtain summaries of it. Search engines and the "push" technology of RSS address the "source" of information, but they do not address its "quantity". Automatic summarization is one of the most effective ways to deal with information overload.

This thesis assumes that documents on different topics call for different feature combination models, so automatic classification is a prerequisite for automatic summarization. After constructing a self-built classification corpus, four feature selection algorithms are used together with the Simple Vector Distance classification algorithm to perform automatic classification. Two measures for evaluating summary sentences are proposed: Probability and Possibility. Based on the summary corpus, machine learning algorithms, including Linear Regression and Logistic Regression, are applied to construct the optimal feature combination model for summary sentences. The thesis also proposes the ROUGE-CN algorithm to handle Chinese text.

The experimental comparison shows that combining automatic classification with regression-based machine learning algorithms improves the quality of machine-generated Chinese news summaries.

The innovation of this thesis is the combination of online RSS feeds with machine-learning-based automatic summarization. An automatic summarization system based on RSS feeds has been implemented. The system obtains news text from online RSS feeds, extracts metadata using regular expression matching, offers users various options, and then generates the class label and summary.
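The abstract does not describe how ROUGE-CN works, so the following is only an illustrative sketch and not the thesis's algorithm. It assumes one common way of adapting ROUGE-N recall to Chinese: since Chinese text has no whitespace word boundaries, n-grams are counted over characters instead of words. The function names `char_ngrams` and `rouge_n_recall` are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams, ignoring whitespace (assumed tokenization)."""
    chars = [c for c in text if not c.isspace()]
    return Counter(tuple(chars[i:i + n]) for i in range(len(chars) - n + 1))


def rouge_n_recall(candidate, reference, n=2):
    """ROUGE-N recall: share of reference n-grams also found in the candidate.

    Counts are clipped, so a repeated n-gram in the reference is matched
    at most as many times as it occurs in the candidate.
    """
    ref = char_ngrams(reference, n)
    cand = char_ngrams(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "今天天气很好"   # human-written summary (made-up example)
    candidate = "今天天气不错"   # machine-generated summary (made-up example)
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

This sketch does not strip punctuation or segment words; a real evaluation would need to decide on both, and the thesis may well make different choices.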
Keywords/Search Tags: Automatic Summarization, Machine Learning, Automatic Classification, Regression Analysis, Automatic Summarization Evaluation