Research Of Automatic Summarization Oriented To News Text

Posted on: 2006-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: H T Liu
GTID: 2178360185463353
Subject: Control Science and Engineering

Abstract/Summary:
Since the 1990s, the Internet has developed rapidly all over the world, and the information resources on the Web have become more and more abundant. This has caused the problem of "information explosion": information is plentiful, but knowledge is comparatively scarce. People urgently need a way to acquire the information they want efficiently. Automatic text summarization can compress a text, reduce the reader's burden, and support other text processing techniques; it has therefore become an important research focus.

We live in a changing world, and news occurs every minute. An urgent problem is therefore how to store, retrieve, and mine this information in text form.

This thesis studies automatic summarization techniques for news text. Building on the results of earlier researchers, it addresses the following problems: feature analysis and extraction for news text, and automatic summarization algorithms for single and multiple news texts. The work includes the following aspects:

1. The thesis examines in depth the features of news text, such as its structure and semantics, and clarifies the main factors in news text summarization. It then proposes a new framework for automatic news text summarization and discusses some key techniques in that framework.

2. The thesis studies automatic summarization of a single news text. It proposes a recognition algorithm for new words and discusses feature extraction and importance determination in detail. New features of words and sentences are given, and the method for computing feature item weights is improved. Smoothing and polishing of the produced summary are also discussed.

3. The thesis studies automatic summarization of multiple news texts. Based on the characteristics of news reports on correlated events, it proposes a time-axis based automatic summarization approach for multiple news texts. The approach automatically tracks events both across different times and at the same time. It incorporates semantic analysis to eliminate the influence of synonyms on sentence similarity. A hierarchical clustering method is adopted to determine the local topics of a news text set, and a new scheme for determining the number of clusters is given. Finally, the generation of summaries for multiple news texts is studied.

4. The thesis also designs and implements an automatic news text summarization system. Experimental results show that the proposed summarization technique performs fairly well and that the system can meet current application requirements.
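
As a rough illustration of the feature-weighted sentence extraction described in aspect 2, the sketch below scores each sentence of a single news text by term frequency, with extra weight for title words and for the lead sentence. This is not the thesis's algorithm: the stop-word list, the function names (`tokenize`, `split_sentences`, `summarize`), and the boost values `title_boost` and `lead_boost` are all assumptions chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to",
              "is", "are", "was", "for", "by"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(title, body, max_sentences=2, title_boost=2.0, lead_boost=1.5):
    """Return the highest-scoring sentences of `body` in their original order.

    A sentence's score is the mean weight of its words, where a word's weight
    is its frequency in the body, multiplied by `title_boost` if the word also
    occurs in the title. The first sentence is multiplied by `lead_boost`,
    reflecting the assumption that news leads carry the main facts.
    """
    sentences = split_sentences(body)
    freq = Counter(tokenize(body))
    title_words = set(tokenize(title))

    scored = []
    for idx, sentence in enumerate(sentences):
        words = tokenize(sentence)
        if not words:
            continue  # skip sentences with no content words
        total = sum(freq[w] * (title_boost if w in title_words else 1.0)
                    for w in words)
        score = total / len(words)
        if idx == 0:
            score *= lead_boost
        scored.append((score, idx))

    # Pick the top-scoring sentences, then restore document order.
    top = sorted(scored, reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(i for _, i in top)]


if __name__ == "__main__":
    title = "Flood hits river town"
    body = ("A flood hit the river town on Monday. "
            "Officials said the flood damaged many homes. "
            "The weather was sunny last week. "
            "Local shops sell fresh bread.")
    for line in summarize(title, body):
        print(line)
```

Keeping the selected sentences in document order is a simple stand-in for the smoothing step mentioned above; a fuller system would also need the new-word recognition and improved weight computation that the thesis describes.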
Keywords/Search Tags: Automatic Text Summarization, Feature Extraction, Evaluation of Text Summarization, Vector Space Model, Clustering, Local Topic Determination