Research On A Fast Algorithm For Mining Closed Frequent Itemsets And Building Their Lattice

Posted on: 2009-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2178360272474943
Subject: Computer software and theory
Abstract/Summary:
As economies develop, information becomes increasingly important to countries and enterprises. People are drowning in information yet gain little useful knowledge from it. Data mining technologies have emerged in response and have shown strong vitality.

Association rule mining, which discovers interesting associations or correlations among a large set of data items, is an important branch of data mining. An association rule is considered interesting if it satisfies both the minimum support threshold and the minimum confidence threshold. Association rule mining has become a hot research topic in recent years and has been applied widely in fields such as finance, marketing decision analysis, and business management.

Algorithms are the core of association rule mining. Conventional algorithms rely mainly on mining frequent itemsets, but these algorithms are complex. Earlier research has noted that the traditional association rule mining framework produces too many rules, and that the extent of redundancy is much larger than previously suspected. The set of all closed frequent itemsets can be orders of magnitude smaller than the set of all frequent itemsets, especially for dense datasets, and it loses no information. However, obtaining the closed frequent itemsets alone is not enough to speed up the generation of association rules. An effective data structure is also needed to store the relationships between itemsets, and the lattice is such a structure. Algorithms for mining closed frequent itemsets and their lattice are therefore an important research area within association rule mining.

The main work of this thesis is as follows:

1) It summarizes the shortcomings of some current association rule mining algorithms and analyzes in depth the recent CHARM and CHARM_L algorithms, which mine closed frequent itemsets and their lattice.

2) It introduces the concept of preC to overcome the shortcomings of CHARM_L, such as its inefficient redundancy elimination and lattice construction, and proposes an adapted algorithm, Q-CFIsL. Q-CFIsL inherits the optimization strategies of CHARM_L and adopts some new methods to overcome its shortcomings. Based on a vertical data structure, Q-CFIsL uses linear redundancy elimination on the IT_tree to build the closed frequent itemset lattice, so that mining the closed frequent itemsets and building the lattice are combined into a single process. The experimental results show that Q-CFIsL outperforms CHARM_L, the most recent comparable algorithm.

3) It generates minimal association rules from the closed frequent itemset lattice, gives an algorithm for mining minimal association rules, and compares the minimal rules with the full set of association rules to demonstrate their effectiveness.
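
To make the terms above concrete, here is a minimal Python sketch of closed frequent itemsets and their lattice over a vertical (item to transaction-id set) representation. It is a brute-force illustration only, not CHARM, CHARM_L, or Q-CFIsL. The toy database `TOY_DB`, the function names, and the use of an absolute support count are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
TOY_DB = [
    {"A", "C", "T", "W"},
    {"C", "D", "W"},
    {"A", "C", "T", "W"},
    {"A", "C", "D", "W"},
    {"A", "C", "D", "T", "W"},
    {"C", "D", "T"},
]


def vertical(transactions):
    """Build the vertical layout: item -> set of transaction ids (tidset)."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def closed_frequent_itemsets(transactions, min_support):
    """Return {closed itemset (frozenset): support count}.

    Brute force: enumerate every itemset, keep the frequent ones, and map
    each to its closure (all items shared by every supporting transaction).
    An itemset and its closure always have the same support.
    """
    tidsets = vertical(transactions)
    items = sorted(tidsets)
    closed = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            tids = set.intersection(*(tidsets[i] for i in combo))
            if len(tids) < min_support:
                continue
            closure = frozenset(i for i in items if tids <= tidsets[i])
            closed[closure] = len(tids)
    return closed


def lattice_edges(closed):
    """Return (subset, superset) pairs with no closed itemset in between."""
    return [
        (a, b)
        for a in closed
        for b in closed
        if a < b and not any(a < c < b for c in closed)
    ]


if __name__ == "__main__":
    cfi = closed_frequent_itemsets(TOY_DB, min_support=3)
    for itemset, support in sorted(cfi.items(), key=lambda kv: -kv[1]):
        print("".join(sorted(itemset)), support)
    print(len(lattice_edges(cfi)), "lattice edges")
```

Every frequent itemset has the same support as its closure, so rule supports and confidences can be recovered from the closed itemsets alone. This sketch enumerates all itemsets and is exponential in the number of items. Algorithms such as those discussed in the abstract exist precisely to avoid that enumeration.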
Keywords/Search Tags: association rule, closed frequent itemsets, closed frequent itemsets lattice