Research And Application Of Association Rules Mining Based On Fp-growth Algorithm

Posted on:2007-09-20

Degree:Master

Type:Thesis

Country:China

Candidate:X P Liu

Full Text:PDF

GTID:2178360185965994

Subject:Software engineering

Abstract/Summary:

Mining association rules from large datasets is one of the most important research fields in data mining. It can reveal interesting relationships between itemsets and is therefore widely applied in fields such as marketing and sales, medicine, finance, biology, telecommunications, and agriculture. Since R. Agrawal and R. Srikant first proposed the concept of association rules in 1993, many algorithms have been developed for mining them. The FP-growth algorithm is currently one of the most popular algorithms for mining association rules without candidate generation. However, it suffers from low space utilization and slow execution when mining large datasets.

To overcome these drawbacks, this paper proposes two new algorithms, both based on FP-growth, for mining association rules from large datasets: New Algorithm 1 and New Algorithm 2. The two algorithms adopt different strategies to divide a large dataset into many subsets and then carry out constrained frequent itemset mining on each subset. New Algorithm 1 scans the dataset as many times as there are frequent 1-itemsets and constructs one corresponding subset on each scan. New Algorithm 2 first divides the dataset into datalists, which contain the transaction information of the dataset. It then divides the datalists into subsets by deleting the first item in the first datalist and adding the remaining items to the other datalists, and it repeats the same process for the second datalist and so on.

Experiments were conducted to compare the proposed algorithms with the FP-growth algorithm. The results show that New Algorithm 1 and New Algorithm 2 use less memory and are therefore faster than FP-growth when the minimum support is low or the dataset is very large. The results also show that New Algorithm 2 is faster than New Algorithm 1 because it spends less time creating subsets.

The paper first describes the two new algorithms and then uses an application to illustrate how they find association rules in large datasets.
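
The abstract gives only an outline of the two partitioning strategies, so the following Python sketch is one possible reading of them, not the thesis's implementation. Several things are assumptions made for illustration: items are ordered by descending support, the subset for an item holds the items that follow it in each transaction, `min_support` is an absolute count, and all function names are hypothetical. For brevity, each subset is mined with a plain recursive projection in place of an FP-tree.

```python
from collections import Counter, defaultdict


def frequent_items(transactions, min_support):
    """Frequent 1-items, ordered by descending support (ties broken by name)."""
    counts = Counter(item for t in transactions for item in set(t))
    items = [i for i, c in counts.items() if c >= min_support]
    return sorted(items, key=lambda i: (-counts[i], i))


def partition_by_rescanning(transactions, min_support):
    """Reading of strategy 1: one full scan of the data per frequent item."""
    order = frequent_items(transactions, min_support)
    rank = {item: r for r, item in enumerate(order)}
    subsets = {}
    for item in order:
        subset = []
        for t in transactions:  # a complete scan for this item
            if item in t:
                later = (x for x in set(t) if x in rank and rank[x] > rank[item])
                subset.append(tuple(sorted(later, key=rank.get)))
        subsets[item] = subset
    return subsets


def partition_by_datalists(transactions, min_support):
    """Reading of strategy 2: one scan builds datalists keyed by first item."""
    order = frequent_items(transactions, min_support)
    rank = {item: r for r, item in enumerate(order)}
    datalists = defaultdict(list)
    for t in transactions:  # the only scan of the raw data
        kept = tuple(sorted((x for x in set(t) if x in rank), key=rank.get))
        if kept:
            datalists[kept[0]].append(kept)
    subsets = {}
    for item in order:
        subset = [t[1:] for t in datalists[item]]  # delete the first item
        subsets[item] = subset
        for rest in subset:  # pass the remaining items on to a later datalist
            if rest:
                datalists[rest[0]].append(rest)
    return subsets


def mine_subset(prefix, suffixes, min_support, out):
    """Constrained mining: every itemset recorded here contains `prefix`."""
    out[frozenset(prefix)] = len(suffixes)  # support of the prefix itemset
    counts = Counter(item for s in suffixes for item in s)
    for item, count in counts.items():
        if count >= min_support:
            projected = [s[s.index(item) + 1:] for s in suffixes if item in s]
            mine_subset(prefix + (item,), projected, min_support, out)


def mine(transactions, min_support, partition):
    """Mine each subset independently and merge the results."""
    out = {}
    for item, subset in partition(transactions, min_support).items():
        mine_subset((item,), subset, min_support, out)
    return out


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"],
                ["b", "c"], ["a", "b", "c", "d"]]
by_rescan = mine(transactions, 2, partition_by_rescanning)
by_lists = mine(transactions, 2, partition_by_datalists)
print(by_rescan == by_lists)
```

Under these assumptions both strategies produce the same subsets, so they yield the same frequent itemsets. The difference is in the cost of building the subsets: the first scans the full dataset once per frequent item, while the second reads the raw data once and then only moves shrinking remainders between lists, which is consistent with the abstract's remark that New Algorithm 2 spends less time creating subsets.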

Keywords/Search Tags:

constrained frequent itemsets mining, data mining, association rule, candidate generation, FP-growth

Related items

1	Research And Implementation Of An Algorithms For Mining Constrained Maximum Frequent Itemsets
2	Efficient Mining Of Association Rules In Distributed Database System
3	Research Of Association Mining
4	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System
5	Research On Algorithm For Distributed Mining Of Association Rules
6	Study On Frequent Pattern Mining Algorithms And Pruning Strategies
7	Association Rules Candidates To Support The Study Of The Frequency
8	Research On He Algorithm About Mining Association Rule
9	Frequent Itemsets Mining Algorithm And Its Application In Data Flow
10	Research On Algorithms For Mining Maximal Frequent Itemsets