
Research Of Chinese Text Automatic Summarization Based On Conceptual Vector Space Model

Posted on: 2006-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: M Wang
GTID: 2168360152995237
Subject: Computer software and theory
Abstract/Summary:
As the information available on the World Wide Web grows exponentially, information overload has become a significant problem. Text summarization can ease this problem, but summarizing documents by hand is a time-consuming task for human professionals. Given the large volume of information available online in real time, research on automatic summarization has become critical.

In general, automatic summarization is defined as the process of generating the abstract of a document automatically by computer, and it is viewed as one of the important applications of NLU (Natural Language Understanding). Because it is simple and fast, it reduces the time readers need to obtain information. Automatic abstracting is difficult and challenging; it is regarded as one of the standards for testing a machine's intelligence, so people have worked on it for many years. At present, within the limits of related research, an automatic summarization system cannot perform a complete analysis of grammar, semantics and pragmatics, and it generates only indicative abstracts.

In view of this situation, this paper uses HowNet as a tool to obtain concepts and applies a conceptual statistical method to automatic text summarization. The main work is as follows:
1. We propose a method for obtaining the concept of a word by using HowNet.
2. We establish a conceptual vector space model that uses the concepts of words, in place of word frequency, as features, compute sentence weights, and reduce redundancy to obtain the summary.
3. We construct a Chinese automatic summarization system based on the conceptual vector space model.

To evaluate the concept-based automatic summarization system, we use two different methods: intrinsic evaluation and extrinsic evaluation. Compared with a traditional summarization system based on word frequency, the evaluation results show that the proposed algorithm is more efficient and robust.
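
The pipeline described in the abstract (map words to concepts, build concept vectors, weight sentences, drop redundant ones) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the `TOY_CONCEPTS` table is a hypothetical stand-in for a HowNet lookup, the input is assumed to be already segmented into tokens, and both the cosine-to-document sentence weight and the `redundancy_threshold` value are assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical word-to-concept table standing in for a HowNet lookup.
# A real system would query HowNet; these entries are illustrative only.
TOY_CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "doctor": "medical", "physician": "medical", "hospital": "medical",
}


def concept_vector(tokens, concepts=TOY_CONCEPTS):
    """Count concepts instead of surface words; unmapped words count as themselves."""
    return Counter(concepts.get(t, t) for t in tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def summarize(sentences, max_sentences=2, redundancy_threshold=0.8):
    """Return indices of selected sentences, in document order.

    Assumed weighting: a sentence's weight is its cosine similarity to the
    whole-document concept vector. A candidate is skipped when it is too
    similar to a sentence that has already been selected.
    """
    vectors = [concept_vector(s) for s in sentences]
    doc = Counter()
    for v in vectors:
        doc.update(v)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: cosine(vectors[i], doc), reverse=True)
    chosen = []
    for i in ranked:
        if all(cosine(vectors[i], vectors[j]) < redundancy_threshold
               for j in chosen):
            chosen.append(i)
        if len(chosen) == max_sentences:
            break
    return sorted(chosen)


sentences = [
    ["the", "car", "stopped"],
    ["the", "automobile", "stopped"],
    ["a", "doctor", "arrived"],
]
print(summarize(sentences))  # [0, 2]
```

In this toy input, "car" and "automobile" map to the same concept, so the first two sentences have identical concept vectors and the second is dropped as redundant. A plain word-frequency model would treat the two words as unrelated features, which is the kind of difference the concept-based model is meant to capture.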
Keywords/Search Tags: Automatic Summarization, Conceptual Vector Space Model, HowNet, Natural Language Understanding