The Research Of Web Log Mining Based On Rough Set

Posted on: 2007-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Yang
GTID: 2178360182998079
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the World Wide Web has come to touch every aspect of social life. The Web has grown into a distributed information space that comprises billions of websites and contains knowledge of great potential value. People's need to find the useful information they want in this vast store poses new challenges for data mining research. Web data mining is a technique that combines traditional data mining with Web data, and it will attract growing attention as the Internet develops.

The paper covers the following aspects.

First, data preprocessing for Web log mining. Web data is complex and varied, so the object of study must be determined first. The object of Web log mining is not the original data on the Web but the secondary data extracted from the interaction between users and the Internet, which includes the requested URL, the requesting IP address, the timestamp, and so on. These logs offer rich information about users' visits. This part of the paper focuses on how to obtain the characteristics of visits (such as the behavior, frequency, and content of users' visits) and how to establish a data model based on users' visiting behavior.

Second, research on rough sets. The earlier way of mining latent information in a Web log database is to transform the data into a data model that traditional data mining can handle and then process it with data mining techniques (such as association rule algorithms). Although this approach meets the needs of Web mining for the time being, it cannot keep up with dynamically growing demand. In rough set theory, knowledge is regarded as an ability to classify, that is, the ability to construct a partition of the domain. Following the ideas of rough set theory, the paper studies the discretization of the preprocessed data, sets up a new data model, improves the reduction algorithm, and finally extracts static classification rules. It also takes the existence of inconsistent rules into account and studies how to obtain decision rules in the absent case.

Finally, the paper draws its conclusions and outlines prospects for the next steps in log mining.
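
The rough set view of knowledge as a partition of the domain can be illustrated with a minimal sketch. It is not the thesis's data model or reduction algorithm: the session table, the attribute names (pages, duration, buyer), and the helper functions partition and approximations are all hypothetical and chosen only for illustration. The sketch groups already-discretized sessions into indiscernibility classes and computes the lower and upper approximations of a target set. A non-empty boundary region indicates sessions whose condition attributes would lead to inconsistent rules.

```python
from collections import defaultdict


def partition(table, attributes):
    """Group object ids into indiscernibility classes over `attributes`."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of `target`."""
    lower, upper = set(), set()
    for block in partition(table, attributes):
        if block <= target:   # class lies entirely inside the target set
            lower |= block
        if block & target:    # class overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical, already discretized user sessions (illustration only).
sessions = {
    "s1": {"pages": "many", "duration": "long", "buyer": "yes"},
    "s2": {"pages": "many", "duration": "long", "buyer": "yes"},
    "s3": {"pages": "few", "duration": "short", "buyer": "no"},
    "s4": {"pages": "many", "duration": "short", "buyer": "yes"},
    "s5": {"pages": "many", "duration": "short", "buyer": "no"},
}
condition = ["pages", "duration"]
buyers = {s for s, row in sessions.items() if row["buyer"] == "yes"}

lower, upper = approximations(sessions, condition, buyers)
boundary = upper - lower  # sessions that the condition attributes cannot decide

print("lower:", sorted(lower))
print("upper:", sorted(upper))
print("boundary:", sorted(boundary))
```

In this toy table, sessions s4 and s5 share the same condition values but differ in the decision, so they fall into the boundary region. Any rule derived from their class is only possible, not certain.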
Keywords/Search Tags: Data mining, Web log mining, Rough set, Decision rule, Pretreatment