
The Study And Application Of Web Data Mining Using In Information Monitor System

Posted on: 2008-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Liu
Full Text: PDF
GTID: 2178360242472477
Subject: Computer technology
Abstract/Summary:
This thesis draws on practical experience gained while developing a network information monitoring system for information monitoring departments. As the volume of Web information grows, harmful information is spreading faster as well. To address this urgent problem for monitoring departments, it is necessary to build a monitoring system that can effectively find harmful information in web pages and help clean up the network environment quickly. The thesis discusses in depth the techniques and methods that can be used to design and develop such a system.

Starting from an analysis of the shortcomings of existing information retrieval tools, the thesis discusses data mining techniques for searching web information. It then covers in detail the techniques and methods for collecting web information, storing it, building a text categorization model, filtering harmful information according to that model, and training the model to improve it. It also examines the PageRank algorithm used in web structure mining; the text categorization model used in text mining, including Chinese word segmentation, the vector space model, and the support vector machine method; and the KNN method used in the automatic text categorization algorithm. Finally, as an application example, an actual system is built and the results of running it are discussed in detail.

The approaches put forward in this thesis are practical. They can improve the ability to find harmful information with high precision and recall, they can be extended to other areas to discover other kinds of web information, and they broaden the useful applications of web data mining techniques.
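As a rough illustration of how a vector space model and the KNN method can combine for text categorization, the following is a minimal sketch, not the system described in the thesis. It assumes documents are already tokenized (a real Chinese-language pipeline would first need a word segmentation step), uses a common smoothed TF-IDF weighting and cosine similarity, and votes among the k most similar training documents. The function names, the toy training data, and the labels "harmful" and "normal" are all hypothetical.

```python
import math
from collections import Counter


def vectorize(tokens, idf):
    """Map a token list to a sparse TF-IDF vector (dict: term -> weight).

    Terms that never appeared in the training set are ignored.
    """
    tf = Counter(tokens)
    return {t: count * idf[t] for t, count in tf.items() if t in idf}


def tf_idf_vectors(docs):
    """Build smoothed IDF weights and TF-IDF vectors for tokenized docs."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [vectorize(tokens, idf) for tokens in docs], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query_tokens, train_vectors, train_labels, idf, k=3):
    """Label a document by similarity-weighted vote of its k nearest neighbors."""
    q = vectorize(query_tokens, idf)
    scored = sorted(
        ((cosine(q, v), label) for v, label in zip(train_vectors, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim
    return votes.most_common(1)[0][0]


# Hypothetical toy training set; whitespace tokenization stands in for
# the word segmentation a Chinese-language system would require.
TRAIN = [
    ("cheap pills buy now", "harmful"),
    ("win money fast casino", "harmful"),
    ("buy casino chips cheap", "harmful"),
    ("weather forecast sunny today", "normal"),
    ("football match score today", "normal"),
    ("library opening hours today", "normal"),
]
train_docs = [text.split() for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
train_vectors, idf = tf_idf_vectors(train_docs)

if __name__ == "__main__":
    print(knn_classify("cheap casino money".split(), train_vectors, train_labels, idf))
    print(knn_classify("weather today sunny".split(), train_vectors, train_labels, idf))
```

In a filtering setting like the one the abstract describes, pages labeled as harmful by such a classifier would be flagged for review, and newly reviewed pages could be added to the training set to improve the model over time.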
Keywords/Search Tags:Data Mining, Text Mining, Information Retrieval, Text Categorization Model, Vector Space Model