The Study Of Information Retrieval And Knowledge Discovery Methods From Web

Posted on: 2003-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Chen
Full Text: PDF
GTID: 2168360092492878
Subject: Computer application technology
Abstract/Summary:
This thesis studies methods for information retrieval and knowledge discovery from the Web. The main content is as follows.

First, the thesis surveys and draws on a large body of domestic and international research on data mining (DM) and knowledge discovery in databases (KDD). In light of the practical problem, it describes the application of data mining to the Web, including its basic methods, steps, algorithms and the challenges it faces. Because the Web is a vast data source, retrieving information from it is a central problem in this thesis. There are two ways of retrieving information. One is off-line data mining: for HTML pages, information retrieval techniques convert the semi-structured data into structured data and store it in a traditional database, and knowledge is then discovered by various database-based algorithms. The other is on-line data mining, which combines the techniques for retrieving information from the Web with data mining algorithms in order to retrieve knowledge directly.

Second, the thesis studies data preprocessing and mining algorithms based on classification patterns in order to find knowledge. For preprocessing, several methods are studied and improved, including rough sets, data clustering, concept hierarchies and language fields. For mining algorithms, classification is an important knowledge discovery method: it can use a simple model to predict the class of a new sample. Moreover, because the Web is a typical dynamic data source, the thesis studies how to construct decision trees and classification rules in a dynamic data environment.

Finally, based on this study of theory and methods, a system that retrieves information and mines knowledge from agricultural product prices has been developed.
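
The off-line route described in the abstract (turning semi-structured HTML into structured records and storing them in a traditional database) might look roughly like the sketch below. It is only an illustration under stated assumptions and not the system built in the thesis: the sample page, the `TableWrapper` class, the `prices` table and the product names and prices are all hypothetical, and only Python's standard `html.parser` and `sqlite3` modules are used.

```python
from html.parser import HTMLParser
import sqlite3

# Hypothetical sample page; the products and prices are made up for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Rice</td><td>2.40</td></tr>
  <tr><td>Wheat</td><td>1.85</td></tr>
</table>
"""


class TableWrapper(HTMLParser):
    """Hypothetical wrapper: collects the text of each <td> cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only text inside a <td> is kept; header cells (<th>) are ignored.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows without <td> cells (e.g. the header) are skipped
                self.rows.append(tuple(self._row))
            self._row = None


def extract_records(html):
    """Convert the semi-structured page into (product, price) tuples."""
    parser = TableWrapper()
    parser.feed(html)
    parser.close()
    return [(name, float(price)) for name, price in parser.rows]


def store_records(records):
    """Store the structured records in an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (product TEXT, price REAL)")
    conn.executemany("INSERT INTO prices VALUES (?, ?)", records)
    conn.commit()
    return conn


records = extract_records(SAMPLE_HTML)
conn = store_records(records)
```

Once the records are in a table, any database-based mining algorithm can be run over them with ordinary queries. The on-line route would instead feed the extracted records straight into the mining step without the intermediate store.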
Keywords/Search Tags: data mining (DM), knowledge discovery in databases (KDD), information retrieval, Wrapper, data preprocessing, classification pattern, rough set, decision tree, agricultural product price