Study On Multi-Level Association Rules Mining Algorithm Based On FP-Tree

Posted on:2006-01-16

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Hu

Full Text:PDF

GTID:2168360155472726

Subject:Computer system architecture

Abstract/Summary:

PDF Full Text Request

Data mining is a process to reaveal latent and interesting knowledge from massive data, and an effective approach to solve the problem of "rich data and poor knowledge". Association rules mining can reveal interesting correlations between itemsets from massive data. It is an important subject of data mining and is widely used in real life. According to the number of hierarchies involved in rules, association rules can be classified into single-level association rules and multi-level association rules. Compared with single-level association rules, multi-level association rules can provide richer and more generalized knowledge, and meet with more users'requirements, so it is of great practical value to study Multi-level association rules mining. The proposed algorithms for mining multi-level association rules such as Cumulate algorithm, ML_T2L1 algorithm are based on Apriori algorithm. These algorithms still adopt "candidate generate and test"method to get frequent patterns which cause large cost in two aspects: (1) I/O cost. In these algorithms, the database is needed to scan k times (k is the largest length of the candidates), which results in large I/O cost; (2) Computing cost. These algorithms may generate a large number of candidates whose frequency is checked by pattern matching, which leads large computing cost. So these algorithms are inefficient. FP_Growth algorithm is an efficient algorithm for mining single-level association rules. It does not generate candidates and only scans database two times. FP_Growth algorithm overcomes the shortcomings of Apriori algorithm effectively, thus the efficiency of FP_Growth algorithm improves a lot compared with Apriori algorithm. By expanding FP_Growth algorithm, the author proposes a highly efficient algorithm MLAR-FP for mining multi-level association rules. The following measures are applied to expand FP_Growth algorithm: (1) Expand every transaction by adding all ancestors of each item during the process of scanning the database. This measure ensures that we can get multi-level association rules; (2) Delete the ancestors that are not frequent items in time to compress search space and enhance the efficiency of mining; (3) Avoid generating redundant frequent patterns. To verify the correctness and efficiency of MLAR-FP algorithm, the author conducts an experiment on a sales database of a medicine company, and compares it with Cumulate algorithm. The results of the experiment show that MLAR-FP algorithm gets the same result as Cumulate algorithm does and inherits the strongpoint of high efficiency of FP_Growth algorithm. Because MLAR-FP algorithm uses divided conquering approach to get frequent patterns, it has the potential of parallel computing. According to this feature, the author develops parallel MLAR-FP algorithm based on message passing interface under the environment of cluster of workstation. The parallel policy of this algorithm is data parallel and parallel mode is master/slave mode. Considering the difference of each node's capability, this algorithm balances the load of all computing nodes by distributing data to each node dynamically.

Keywords/Search Tags:

Multi-level association rules, FP_Growth algorithm, MLAR-FP algorithm, Parallel computing

PDF Full Text Request

Related items

1	Intelligent Recommendation Of Tourist Attractions Based On Association Rule Mining Algorithm
2	Research And Applications Of Improved FP_Growth Algorithm Based On E-government
3	Research For Association Rules Algorithm On Big Data
4	Parallel Association Rules Algorithm Based On Hadoop
5	An Improved Association Rule Algorithm With The Characteristic Of Dynamic Added Weight And Its Application In Fault Management Of Telecom
6	Research On Distributed Association Rule Algorithm In Data Mining
7	The Research On Association Rules Mining And Algorithm
8	Profit-Analyse Based Multi-level Association Rules Research
9	The Research Of Multi-dimensional And Multi-level Association Rule Algorithm
10	The Study Of Parallel Algorithm For Mining Association Rules And Application In Medicine Selling System