Font Size: a A A

Research And Realization Of Parallel Algorithm For Mining Frequent Closed Itemsets

Posted on:2009-01-01Degree:MasterType:Thesis
Country:ChinaCandidate:C YuFull Text:PDF
GTID:2178360272474946Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Association rules mining is one of the most important branches in data mining. Its task is to discover the complete set of strong rules satisfying both the minimum support and confidence thresholds. During the past few years, with widespread and success in ares such as marketing and decision aiding, association rules mining has been a popular issue in data mining research industry.Mining frequent pattern is the key of association rules mining and also it is the focus of association rules study. There are three kinds of frequent patterns being proposed for mining association rules: frequent itemset, frequent closed itemset and maximal frequent itemset. At the beginning, association rules were generated through frequent itemsets. However the scale of frequent itemsets is usually huge, which greatly affects the efficiency of the association rules mining. Since the set of closed frequent itemsets can be orders of magnitude smaller than the complete set of frequent itemsets, and it can derive the whole set of frequent itemsets, also the association rules generated by frequent closed itemsets can derive whole association rules. Therefore, mining association rules through frequent closed itemsets rather than frequent itemsets has become more common recently. However, to accelerate the geneation of association rules, only mining frequent closed itemsets is not enough. The relationship among itemsets should be kept in some efficient data structure,such as frequent closed itemset lattice. Through the lattice, we can easily identify the supsets and subsets of any itemsets, so that we can generate association rules more efficiently.Parallel technology is usually effective to improve the efficiency. As the datasets in many real-world problems are huge,parallel technology is introduced in association rules mining and receives extensive and thorough study. However the proposed parallel association rules mining algorithms are mainly based on parallel frequent itemsets mining, whilst parallel frequent closed itemsets mining algorithm is rare. This thesis holds a general discussion on the association rules mining; Analyzes the current research both within the country and abroad; Analyzes the advantages and disadvantages of the typical frequent itemsets, frequent closed itemsets and parallel frequent itmesets mining algorithms; Proposed an effective parallel algorithm P-CHARM for mining frequent closed itemsets, and an effective parallel algorithm Q-CFIsL for mining frequent closed itemsets and building their lattice. P-CHARM algorithm has two versions, P-CHARM I and P-CHARM II. P-CHARM II is an improvement of P-CHARM I. P-Q-CFIsL is a parallel algorithm based on Q-CFIsL which is an e?cient algorithm presented by us for mining frequent closed itemset and their lattice. Eexperiments show that P-CHARM and P-Q-CFIsL are both effective.
Keywords/Search Tags:Association Rule, Frequent Itemset, Frequent Closed Itemset, Frequent Closed Itemset Lattice, Parallel Algorithm
PDF Full Text Request
Related items