
The Research Of Automatic Text Summarization Technology On Internet

Posted on: 2011-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Hu
Full Text: PDF
GTID: 2178330332964388
Subject: Computer technology
Abstract/Summary:
The rapid development of the Internet has made a huge amount of resources available to people and has driven progress in information processing technology. As the volume of online information accessible to users grows exponentially, traditional techniques for processing and managing text data can no longer satisfy users' varied demands. Information retrieval and summarization are the most crucial technologies here: information retrieval is an effective way to acquire the required information, while summarization reduces the burden of reading and helps people extract the main relevant information. Summarization also supports information retrieval and post-processing, and its concise, clear output makes it an effective means of information mining.

The primary research can be summarized as follows:

Firstly, this paper reviews the development of text summarization and analyzes the technology of web text summarization. For web text, it proposes a method for filtering noise from web pages that eliminates tags and most advertisements on a page while retaining the useful information in the text. A theme extraction method eliminates information that is irrelevant to the page theme.

Secondly, this paper implements a web text summarization system. The system comprises four functional modules: web text content extraction, pre-processing, text clustering, and summary generation. The content extraction module uses DOM tree parsing to extract the required content from web pages; the other three modules generate the summary.

Thirdly, 1000 web pages were downloaded from the Internet for experiments. The experimental results show that the system is efficient and that it effectively improves summarization performance.
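
The noise-filtering step described above can be illustrated with a minimal sketch. This is not the thesis's method: it assumes that noise lives inside a fixed set of tags (the hypothetical `NOISE_TAGS` below) and uses Python's standard event-based `html.parser` rather than a full DOM tree. The names `ContentExtractor` and `extract_text` are illustrative only.

```python
from html.parser import HTMLParser

# Assumption for illustration: text inside these tags is treated as noise.
NOISE_TAGS = {"script", "style", "nav", "aside", "footer", "iframe"}


class ContentExtractor(HTMLParser):
    """Collects visible text while skipping everything inside noise tags."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0  # > 0 while inside at least one noise tag
        self.chunks = []      # text fragments kept so far

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.noise_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the page text with markup and noise-tag contents removed."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

A real system would likely need stronger signals than tag names (for example link density or block position) to catch advertisements placed in ordinary `div` elements; the sketch only shows the general shape of tag stripping plus noise removal.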
Keywords/Search Tags: web text summarization, text clustering, summary sentence ordering