
Research and Development of a Statistics-Based Automatic Summarization System

Posted on: 2009-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: J M Tan
GTID: 2178360278470994
Subject: Computer application technology
Abstract/Summary:
Society today faces an information explosion, driven by the steady growth of network traffic that has accompanied the development of Internet technology and social progress. Every day a great deal of information reaches people in the form of electronic documents. It has therefore become an urgent problem to locate the required information and its main ideas within this pool, and to read quickly the information created each day. People cannot achieve this by browsing all the electronic data, so traditional manual information processing is far from adequate. Information compression and selection technologies are urgently needed to abstract and condense this mass of information, and automatic summarization is a powerful means of doing so.
Automatic summarization is an important research topic in natural language processing. It aims to explore the mechanisms by which information is obtained and extracted from natural language text, and on that basis to develop systems that write summaries automatically, so as to improve the efficiency of information retrieval and transmission. Abstracts are concise, cohesive short passages that reflect the central ideas of a document, and they meet the need for obtaining information better than an index does.
Although research on summarization technology is still at an early stage, its importance should not be underestimated, since it is expected to be widely used in information processing. At present, automatic summarization methods fall into four main categories: understanding-based abstracting, automatic extraction, information extraction, and structure-based abstracting.
In this thesis, building on current research, a system is developed that combines natural language processing methods with a widely used programming language. Algorithms for word segmentation and for weight calculation are adopted. Taking into account speed and quality in practical use, weighted formulas for selecting keywords and key sentences are given, and the summary sentences are extracted verbatim from the original text. The method is simple and can be applied to unrestricted domains.
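The abstract does not give the thesis's actual formulas, so the following is only a minimal sketch of the general approach it describes: segment the text into words, weight the words, score each sentence by its words' weights, and extract the top-scoring sentences unchanged. Everything here is an assumption made for illustration: the regex tokenizer stands in for a real word-segmentation algorithm, relative term frequency stands in for the thesis's weighting formula, and the names `segment`, `keyword_weights`, `summarize`, and `STOP_WORDS` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "it", "on", "for"}


def segment(text):
    """Stand-in for word segmentation: lowercase alphabetic tokens minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keyword_weights(text):
    """Assumed weighting: each word's relative frequency in the document."""
    counts = Counter(segment(text))
    total = sum(counts.values()) or 1
    return {w: c / total for w, c in counts.items()}


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, verbatim and in original order."""
    weights = keyword_weights(text)
    sentences = split_sentences(text)
    scored = []
    for idx, sent in enumerate(sentences):
        words = segment(sent)
        # Average word weight, so long sentences are not favoured by length alone.
        score = sum(weights[w] for w in words) / len(words) if words else 0.0
        scored.append((score, idx))
    # Highest score first; ties broken by earlier position in the text.
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:n_sentences]
    return [sentences[i] for i in sorted(i for _, i in top)]


if __name__ == "__main__":
    doc = "Cats chase mice. Cats sleep. Dogs bark loudly at night."
    print(summarize(doc, n_sentences=2))
```

Because the summary reuses original sentences rather than generating new text, a sketch like this needs no domain knowledge, which is consistent with the abstract's claim that the method applies to unrestricted domains. For Chinese text, the `segment` function would have to be replaced by a genuine word segmenter.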
Keywords/Search Tags: automatic summarization, natural language, word segmentation, weighted value, key sentence