
Research on Web Hot Event Mining and Characterization

Posted on: 2011-07-17 | Degree: Master | Type: Thesis
Country: China | Candidate: B Li | Full Text: PDF
GTID: 2178330338990040 | Subject: Systems Science
Abstract/Summary:
With the rapid development of science and technology, the way information is disseminated has changed completely. In particular, the spread of Internet technology has made the network a primary means of accessing information. However, as the volume of online information grows explosively, it has become increasingly difficult for users to find the data they need in this mass of information.

At present, people rely mainly on search engines to obtain information from the network, but search engines are limited to keyword matching. The results contain much unrelated or redundant information, and formulating a query requires some prior knowledge, so users often fail to learn of hot events in time. Although each site publishes hot news rankings for a given period, these rankings are compiled manually and are therefore highly subjective, so the same hot event is ranked differently across the major sites. There is thus a practical need for technology that extracts hot events automatically, accurately, and in real time.

Working from massive Internet text, this thesis studies the mining and characterization of web hot events. It designs a scheme for mining hot events on the network and implements an experimental system. The main work includes:

1. The models and algorithms for event mining are analyzed. By comparing process models for event mining and algorithms for web text extraction, text preprocessing, and text clustering, the concept of an event is defined and the design of the event mining system is given.
2. An event mining method based on secondary clustering is proposed. After a detailed analysis of existing algorithms in topic detection and tracking and in document clustering, event mining based on secondary clustering is proposed, which can effectively reduce computational complexity and improve the accuracy of the mining results.
3. The event feature extraction algorithm based on local topic sentence groups is improved. By analyzing the advantages and disadvantages of automatic abstracting algorithms, an automatic event abstracting algorithm based on local topic sentence extraction is proposed for summarizing hot events on the network, which can improve the readability of the mining results.
4. The event mining algorithms and scheme are evaluated experimentally. A system for mining hot events on the network is designed and implemented. The experimental results show that the algorithms and scheme are effective and can be applied in many fields.
Keywords/Search Tags:events mining, topic detection and tracking, document clustering, event relevant multi-document summarization
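
The abstract does not spell out how the secondary clustering works, so the following is only a minimal sketch of one common two-pass scheme, not the thesis's actual algorithm. It assumes a cheap single-pass clustering first groups documents into small clusters, and a second pass then merges clusters with similar centroids, so the pairwise merge step runs over a few centroids rather than over every document. The function names (`tf_vector`, `single_pass`, `merge_pass`, `mine_events`), the whitespace tokenizer, the raw term-frequency weighting, and both thresholds are illustrative assumptions.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def single_pass(vectors, threshold):
    """First pass: put each document in its most similar cluster, or open a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc indices]}
    for i, v in enumerate(vectors):
        best, best_sim = None, threshold
        for c in clusters:
            s = cosine(v, c["centroid"])
            if s >= best_sim:
                best, best_sim = c, s
        if best is None:
            clusters.append({"centroid": Counter(v), "members": [i]})
        else:
            # Keep a summed vector as the centroid; cosine is scale-invariant.
            best["centroid"].update(v)
            best["members"].append(i)
    return clusters


def merge_pass(clusters, threshold):
    """Second pass: repeatedly merge any two clusters whose centroids are similar."""
    clusters = [
        {"centroid": Counter(c["centroid"]), "members": list(c["members"])}
        for c in clusters
    ]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(clusters[i]["centroid"], clusters[j]["centroid"])
                if sim >= threshold:
                    clusters[i]["centroid"].update(clusters[j]["centroid"])
                    clusters[i]["members"].extend(clusters[j]["members"])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


def mine_events(docs, first_threshold=0.5, second_threshold=0.3):
    """Return candidate events as lists of document indices, largest first."""
    vectors = [tf_vector(d) for d in docs]
    micro = single_pass(vectors, first_threshold)    # strict threshold, small clusters
    events = merge_pass(micro, second_threshold)     # looser threshold, merged events
    return sorted((sorted(e["members"]) for e in events), key=len, reverse=True)
```

In this sketch the size of a merged cluster stands in for how "hot" an event is; a real system would likely also weight terms (for example with TF-IDF) and take publication time into account, neither of which is shown here.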