Research Of Document Summarization Based On Topic Analysis

Posted on: 2009-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: K M Nie
Full Text: PDF
GTID: 2178360245499998
Subject: Computer application technology
Abstract/Summary:
Automatic summarization is a topic in natural language processing: it automatically produces a summary that reflects the essential information of the original article. With the spread of the Internet, the network has become a huge information resource, so automatic summarization can save users time when searching for useful information.

Statistics-based topic partition methods require a threshold, which is usually fixed through repeated experiments. A threshold fixed in this way lacks adaptability. This thesis introduces a topic partition method that obtains the threshold automatically and compares it with the method that sets the threshold by experiment. The experiments show that the method is effective: it meets the needs of automatic summarization and makes up for the shortcomings of setting the threshold by experiment.

Because automatic summarization mines potential knowledge from a set of documents related to one theme, document clustering has received wide attention. The K-means document clustering method has linear time complexity, but its cluster centers are difficult to choose. A center selection method based on sub-graph division is presented. The experimental results show that the method is effective and that, compared with traditional methods, it improves the F-measure of the clustering result.

A multi-document set can be divided into several local themes, and extracting information from each of these local themes gives the summary high coverage. A multi-document theme formation technique based on single-document topic partition is presented, and four theme formation methods are compared. After the local themes of the document set are analyzed, the summary is formed by extracting sentences from them. Finally, the multi-document method based on single-document topic partition is compared with the method based on sentence clustering.
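The idea of a topic partition whose threshold is derived from the document rather than tuned by hand can be illustrated with a short sketch. The abstract does not say how the thesis computes its threshold, so everything below is an assumption made for illustration: the bag-of-words cosine similarity between adjacent sentences, the rule `mean - 0.5 * std`, and the hypothetical function names `bag_of_words`, `cosine`, and `partition_topics`.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercased word counts for one sentence."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def partition_topics(sentences):
    """Split a list of sentences into topic segments.

    A boundary is placed wherever the similarity between two adjacent
    sentences falls below a threshold computed from the document itself.
    """
    if len(sentences) < 2:
        return [list(sentences)]

    vectors = [bag_of_words(s) for s in sentences]
    sims = [cosine(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]

    mean = sum(sims) / len(sims)
    std = math.sqrt(sum((s - mean) ** 2 for s in sims) / len(sims))
    # Assumed rule (not taken from the thesis): the threshold adapts to the
    # distribution of adjacent-sentence similarities in this document.
    threshold = mean - 0.5 * std

    segments, current = [], [sentences[0]]
    for sentence, sim in zip(sentences[1:], sims):
        if sim < threshold:
            segments.append(current)
            current = []
        current.append(sentence)
    segments.append(current)
    return segments


docs = [
    "cats chase mice",
    "cats chase birds",
    "stocks fell today",
    "stocks rose today",
]
segments = partition_topics(docs)
```

In this toy input the only adjacent pair with no shared words is the second and third sentence, so the sketch places its single boundary there. A hand-tuned method would instead fix `threshold` to a constant chosen by experiment, which is the approach the abstract describes as lacking adaptability.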
Keywords/Search Tags: Automatic summarization, Document clustering, Manifold ranking, Multi-document summarization