Research On Application Of Association Rules Mining Algorithm In Web Log Mining

Posted on: 2012-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: L H Fu
GTID: 2218330338462906
Subject: Computer software and theory
Abstract/Summary:
As networks become part of people's daily lives, more and more institutions and individuals are offering and searching for information on the internet. As a result, the web holds abundant data, which provides a wealth of resources for data mining. On the other hand, the characteristics of this data also pose great challenges for data mining, and these challenges promote research on applying data mining techniques to web data.
Web data mining is the use of data mining or machine learning to automatically discover and extract information from web documents and services. Web log mining is the application of data mining techniques to discover usage patterns from web usage data, in order to understand and better serve the needs of web-based applications.
This thesis studies applications of association rules in web log mining. First, the thesis introduces the meaning of web data mining, the process of web data mining, and the categories of web data mining. It then discusses web log data collection and pre-processing, especially the essential pre-processing tasks: data cleaning, user identification, session identification, and path completion. In addition, the techniques applied to web log mining and the applications of web log mining are described. Next, the thesis turns to association rules mining. It first describes the basic concepts of association rules, and then the Apriori and Eclat algorithms, going into detail on the underlying ideas, implementation, advantages, and limitations of each. Based on the analysis of the two algorithms, an improvement of the Eclat algorithm is presented, and the performance of the improved algorithm is tested experimentally on a variety of data sets. The experiments show that the improved algorithm is more suitable for sparse data sets.
Finally, based on the theory of web log mining and the association rules mining algorithms, a web log mining prototype system is presented, and an experiment on the system is carried out with the NASA_HTTP data sets.
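To make the Eclat idea mentioned above concrete, the following is a minimal sketch of the basic, textbook Eclat procedure applied to web sessions. It is not the improved variant proposed in the thesis, whose details are not given in this abstract. The function name `eclat`, the sample page paths, and the support threshold are all assumptions chosen for illustration. Each session is treated as a transaction (the set of pages visited), each page is mapped to the set of session ids that contain it, and longer itemsets are found by intersecting those id sets.

```python
from collections import defaultdict


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Illustrative sketch of basic Eclat; `min_support` is an absolute count.
    """
    # Vertical layout: map each item to the ids of the transactions containing it.
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # `candidates` holds (item, tidset) pairs that are frequent together with `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            # Intersect with the tidsets of later items to grow the itemset by one.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    # Start from the frequent single items, in a fixed order to avoid duplicates.
    start = sorted(
        ((item, tids) for item, tids in tidlists.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


# Hypothetical sessions: each set lists the pages requested in one user session.
sessions = [
    {"/index.html", "/news.html", "/faq.html"},
    {"/index.html", "/news.html"},
    {"/index.html", "/faq.html"},
    {"/news.html", "/faq.html"},
]
result = eclat(sessions, min_support=2)
for pages, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(pages), count)
```

With these sample sessions, every single page and every pair of pages meets the threshold of two sessions, while the three-page set appears in only one session and is pruned. Association rules could then be derived from the support counts in `result`.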
Keywords/Search Tags:Web Data Mining, Web Log Mining, Association Rules Mining, Apriori, Eclat