Research Of Automatic Summarization Based On Named Entity

Posted on:2010-07-26

Degree:Master

Type:Thesis

Country:China

Candidate:D An

GTID:2178360278460856

Subject:Computer application technology

Abstract/Summary:

With the spread of the Internet, the network has become a huge information resource. This volume of information is convenient, but it also makes it hard to find the information that matters. Important news events are reported on many websites. This makes the news easy to access, but a reader facing tens of thousands of news documents that are identical or similar in meaning finds it difficult to grasp their main idea, and reading and analyzing them all would take too much time and energy.

Automatic summarization extracts concise and important information from documents on a specific subject. News automatic summarization is an application of multi-document summarization that helps readers grasp the gist of the news quickly. There are three main difficulties in multi-document summarization: selecting sentences for the summary, excluding redundancy, and ordering the sentences. This thesis proposes a method of news automatic summarization based on named entities. The method finds the important news factors by identifying and counting the named entities in the news documents, excludes redundant sentences by calculating sentence similarity, and then orders the selected sentences according to the time information they contain.

The automatic summarization system based on named entities implements the proposed method. First, named entities such as time, location, person, and organization, which represent the important factors of news, are identified and extracted from the news documents and then counted. The weight of each sentence is calculated from the frequency of its named entities, the position of the sentence, and the length of the sentence. Several sentences are then chosen according to their weights as a preliminary set of sentences for the summary.
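
The sentence weighting and selection step can be illustrated with a minimal sketch. The abstract names the three features (named entity frequency, sentence position, sentence length) but not how they are combined, so the linear combination, the coefficients `alpha`, `beta`, `gamma`, the `ideal_len` parameter, and the input layout below are all assumptions made for illustration, not the formula used in the thesis. Named entity recognition is assumed to have been done upstream.

```python
from collections import Counter


def score_sentences(sentences, alpha=0.6, beta=0.3, gamma=0.1, ideal_len=20):
    """Return one weight per sentence.

    Each sentence is a dict (hypothetical layout) with:
      'tokens'   - list of word strings
      'entities' - list of named entity strings found in the sentence
      'position' - 0-based index of the sentence within its source document
    """
    # Count how often each named entity occurs across all documents.
    entity_freq = Counter(e for s in sentences for e in s["entities"])
    max_freq = max(entity_freq.values(), default=1)

    scores = []
    for s in sentences:
        ents = s["entities"]
        # Entity term: mean frequency of the sentence's entities, scaled to [0, 1].
        if ents:
            entity_score = sum(entity_freq[e] for e in ents) / (max_freq * len(ents))
        else:
            entity_score = 0.0
        # Position term: earlier sentences in a news story score higher.
        position_score = 1.0 / (1 + s["position"])
        # Length term: penalise sentences far from the assumed ideal length.
        n = len(s["tokens"])
        length_score = min(n, ideal_len) / max(n, ideal_len) if n else 0.0
        scores.append(alpha * entity_score + beta * position_score + gamma * length_score)
    return scores


def select_top(sentences, scores, k):
    """Pick the k highest-weighted sentences as the preliminary set, best first."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in ranked[:k]]
```

With weights like these, a lead sentence that mentions frequently reported entities is preferred over a late sentence with no entities. The coefficients would need tuning on real news data.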
Redundant sentences are then excluded from the preliminary set by calculating the pairwise similarity between sentences. Finally, the remaining sentences are sorted by the time information they contain. Experimental results indicate that the new system is effective and applicable in practice.
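
The redundancy exclusion and time ordering steps can be sketched in the same spirit. The keywords mention the vector space model, so this sketch uses cosine similarity over raw term-count vectors. The term weighting, the greedy filtering strategy, the `threshold` value, and the `'time'` field (a date assumed to have been extracted from each sentence's time entity) are assumptions for illustration, not details confirmed by the abstract.

```python
import math
from collections import Counter
from datetime import date


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two sentences as term-count vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def remove_redundant(candidates, threshold=0.7):
    """Greedy filter over candidates ordered best-first.

    A sentence is kept only if its similarity to every sentence
    already kept is below the (hypothetical) threshold.
    """
    kept = []
    for cand in candidates:
        if all(cosine(cand["tokens"], k["tokens"]) < threshold for k in kept):
            kept.append(cand)
    return kept


def order_by_time(sentences):
    """Sort by the 'time' field; sorted() is stable, so ties keep their order."""
    return sorted(sentences, key=lambda s: s["time"])


# Usage: two near-duplicate sentences and one distinct, earlier sentence.
candidates = [
    {"tokens": "the quake hit the city".split(), "time": date(2010, 5, 2)},
    {"tokens": "the quake hit the town".split(), "time": date(2010, 5, 2)},
    {"tokens": "rescue teams arrived".split(), "time": date(2010, 5, 1)},
]
summary = order_by_time(remove_redundant(candidates))
```

Because the candidates are processed best-first, the higher-weighted sentence of each near-duplicate pair is the one that survives.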

Keywords/Search Tags:

Automatic Summarization, Multi-Document Summarization, Named Entity, Vector Space Model, Sentence Similarity

Related items

1	Research On Automatic Multi-document Summarization Based On Statistics And Semantic Analysis
2	Multi-Document Automatic Summarization Based On The Term-Sentence-Document Tri-layer Graph Model
3	Design And Implementation Of Multi Document Automatic Summarization System In Biomedical
4	Automatic Summarization Of Multimedia Information And Related Technology Research
5	Evaluation Method Research Of Automatic Summarization Calculating The Similarity Of Text Based On HowNet
6	Research Of Document Summarization Based On Topic Analysis
7	Research On Chinese Automatic Summarization System
8	The Approach For Event-based Multi-document Automatic Summarization
9	Research Of Web Multi-document Automatic Summarization
10	Research On Chinese Automatic Summarization And Its Evaluation Method