Research Of Web Multi-document Automatic Summarization

Posted on: 2011-07-14  Degree: Master  Type: Thesis
Country: China  Candidate: H Y Fu  Full Text: PDF
GTID: 2178330332460341  Subject: Computer software and theory
Abstract/Summary:
The Political Party Diplomacy Auxiliary Decision Supporting System is an intelligent clustering-and-search system. Given input keywords, it retrieves large sets of documents on the same subject and presents automatically generated summaries, so that users can skim the information quickly and make correct decisions promptly. Automatic summarization is an important part of this system, and this research was undertaken to further optimize it. Web multi-document automatic summarization aims to present comprehensive and concise information to users, which saves them browsing time. At present, two kinds of methods are used for multi-document automatic summarization. The first ranks all sentences of the document set together by weight and selects summary sentences in rank order according to the compression ratio. The second divides the document set into several partial subjects and then selects summary sentences from the different partial subjects. Because users require comprehensive and concise summaries, this thesis concentrates on the second kind of method. It studies four aspects of multi-document automatic summarization: similarity computation, partial subject division, optimal selection of summary sentences, and ordering of summary sentences. Building on an in-depth analysis of these aspects, it improves the optimal selection and ordering of summary sentences based on partial subject division.
The main improvements are as follows. First, the method for computing the semantic distance between words is improved, and a sentence similarity measure based on Euclidean distance and semantic distance is proposed. Second, the k-central point (k-medoids) algorithm is optimized so that it automatically discovers the seeds and the number of categories based on sentence density. Third, the scoring method for partial subjects and the method for judging the information coverage fraction of sentences are improved, which in turn optimizes the iterative strategy for selecting the optimal summary sentences. Fourth, an improved three-rank ordering method is proposed, based on the two-rank ordering method. Finally, the algorithms are applied in a Web multi-document automatic summarization system, and experiments and result analysis are carried out on them.
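The partial-subject approach described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the similarity function is a plain word-overlap (Jaccard) measure standing in for the proposed measure based on Euclidean and semantic distance, the seeds are supplied by hand rather than discovered from sentence density, and each cluster medoid is taken as the summary sentence in place of the iterative selection strategy. The names `jaccard`, `k_medoids`, and `summarize` are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; an assumed stand-in measure."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def k_medoids(sentences, seeds, max_iter=20):
    """Group sentence indices around medoids, starting from seed indices."""
    medoids = list(seeds)
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each sentence joins its most similar medoid.
        clusters = {m: [] for m in medoids}
        for i, s in enumerate(sentences):
            best = max(medoids, key=lambda m: jaccard(s, sentences[m]))
            clusters[best].append(i)
        # Update step: the new medoid is the member most similar to its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)  # keep the old medoid for an empty cluster
                continue
            new_medoids.append(max(
                members,
                key=lambda c: sum(jaccard(sentences[c], sentences[o])
                                  for o in members),
            ))
        new_medoids = list(dict.fromkeys(new_medoids))  # drop duplicates
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return clusters


def summarize(sentences, seeds):
    """Return one representative sentence per partial subject, in source order."""
    clusters = k_medoids(sentences, seeds)
    chosen = sorted(m for m, members in clusters.items() if members)
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    docs = [
        "the party held talks on trade",
        "talks on trade were held by the party",
        "the delegation visited the capital",
        "the delegation visited the capital city",
    ]
    for line in summarize(docs, seeds=[0, 2]):
        print(line)
```

Taking one sentence per cluster is what gives this family of methods its coverage: near-duplicate sentences fall into the same partial subject, so the summary does not repeat them, whereas ranking all sentences globally by weight may pick several from the same subject.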
Keywords/Search Tags: Multi-document automatic summarization, Sentence similarity, Partial subject, Summarization sentence