Web Log Mining Based on Closed Frequent Itemsets

Posted on: 2011-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Yan
GTID: 2208330332977056
Subject: Software engineering
Abstract/Summary:
With the rapid development of information technology and the deepening use of databases, data mining has become a research hotspot. Web log mining is an important direction within Web data mining: by analyzing Web logs and finding the regularities in them, it can discover the access rules and patterns of Web users. It can be widely used to identify potential customers in e-business and to improve the quality and efficiency of enterprise information portals.

However, traditional association-rule-based Web log mining methods are usually built on frequent itemsets. Such methods often produce a large number of candidate rules, many of them redundant, which strains both running time and main memory. The number of frequent closed itemsets is much smaller than the number of frequent itemsets, yet all frequent itemsets can be derived from the frequent closed itemsets, and the association rules generated from frequent closed itemsets are sufficient to derive all rules. On this basis, this thesis proposes a new algorithm, CFIs_Webmining, based on frequent closed itemsets (CFIs). The algorithm builds the frequent closed itemset lattice using CHARM_L and extracts the minimal association rules from that structure. CFIs_Webmining addresses some of the problems that appeared in earlier association rule mining algorithms.

First, the thesis describes the research background and the state of research on Web data mining in China and abroad, and summarizes data mining, Web data mining, and Web log mining. Second, it examines the whole process of Web log preprocessing. It then focuses on association rules and the classical algorithms Apriori and CHARM in the study of access patterns in Web log mining. Next, it introduces the concept of closed frequent itemsets, the CHARM algorithm, which mines frequent closed itemsets efficiently, and the CHARM_L algorithm, which builds on CHARM to generate the frequent closed itemset lattice. It also introduces the concept of minimal association rules; CHARM_L and minimal association rules are the two main components of CFIs_Webmining. A large number of experiments show that the CFIs_Webmining algorithm is effective. Finally, the campus Web logs of ZhouKou Normal University are taken as the data source and mined with the proposed CFIs_Webmining algorithm. After preprocessing the log files, valuable rules are extracted, and recommendations and methods for improving the site are put forward.
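
The relationship between frequent itemsets and frequent closed itemsets described in the abstract can be illustrated with a small Python sketch. This is not CHARM, CHARM_L, or CFIs_Webmining: it is a brute-force enumeration suitable only for toy data, and the session contents, page paths, support threshold, and function names are all hypothetical choices made for illustration. An itemset is treated as closed when no proper superset has the same support.

```python
from itertools import combinations

# Hypothetical preprocessed sessions: each is the set of pages one user visited.
sessions = [
    frozenset({"/index", "/news", "/library"}),
    frozenset({"/index", "/news"}),
    frozenset({"/index", "/news", "/library"}),
    frozenset({"/index", "/admissions"}),
    frozenset({"/index", "/news", "/admissions"}),
]
MIN_SUPPORT = 2  # absolute support threshold (assumed value)


def frequent_itemsets(sessions, min_support):
    """Brute-force enumeration of frequent itemsets with their support counts."""
    items = sorted(set().union(*sessions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for s in sessions if itemset <= s)
            if support >= min_support:
                result[itemset] = support
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return result


def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
    }


def support_from_closed(itemset, closed):
    """Recover a frequent itemset's support from the closed itemsets alone."""
    return max(sup for c, sup in closed.items() if itemset <= c)


frequent = frequent_itemsets(sessions, MIN_SUPPORT)
closed = closed_itemsets(frequent)

print(len(frequent), "frequent itemsets;", len(closed), "closed")
for itemset, support in sorted(closed.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), support)
```

On this toy data, 9 frequent itemsets reduce to 4 closed ones, and `support_from_closed` returns the correct support for every frequent itemset, which is the sense in which the closed itemsets lose no information.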
Keywords/Search Tags: Web log, Web log mining, association rule mining, frequent closed itemsets, lattice structure, minimal association rules