Research On Association Rules Algorithm Based On Frequent Pattern Tree

Posted on: 2017-01-14  Degree: Master  Type: Thesis
Country: China  Candidate: S Chen  Full Text: PDF
GTID: 2308330482484164  Subject: Applied Mathematics
Abstract/Summary:
With the development of Internet technology, the amount of data has increased rapidly. Data mining technology emerged to find valuable rules and patterns in these vast amounts of data, and it has achieved fruitful results. Association rule mining is an important and very active branch of data mining. Frequent itemset mining is the key step in association rule mining and is used in financial business, e-commerce, communication networks, cross-shopping analysis and other areas. At present, most frequent itemset mining algorithms are based on the Apriori algorithm, which must scan large-scale data sets many times, so system I/O becomes the bottleneck of the algorithm, and which requires substantial CPU resources to generate candidate frequent itemsets. As data volumes grow exponentially, the inefficiency of these algorithms becomes the key problem. Existing parallel frequent itemset mining algorithms are sensitive to the data distribution and are prone to unbalanced load, so that uneven allocation of computing tasks reduces efficiency. In this thesis, we first propose a metadata-based text data entry method for handling text data. We then introduce FP-Forest, a data-parallel frequent itemset mining algorithm based on the FP-Tree structure. The algorithm can mine frequent itemsets in parallel on many task nodes, needs to scan the data set only twice, and does not need to generate candidate sets. It also balances the load across task nodes well, giving a reasonable distribution of computing resources. Finally, the algorithm is evaluated experimentally by running it and monitoring its speedup. The analysis finds that constructing the FP-Tree in parallel saves a lot of time and that the algorithm parallelizes well.
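The two-scan, candidate-free idea mentioned in the abstract can be illustrated with a minimal single-machine sketch of the classic FP-growth procedure. This is not the thesis's FP-Forest algorithm, whose parallel design is not described in the abstract; the names `Node`, `build_fp_tree`, `fp_growth` and the sample `baskets` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


class Node:
    """One FP-tree node: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs using two passes."""
    # Pass 1: count the support of every single item.
    support = Counter()
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i for i, c in support.items() if c >= min_support}

    # Pass 2: insert each transaction, keeping only frequent items,
    # ordered by descending support (ties broken by item name).
    root = Node(None, None)
    header = defaultdict(list)  # item -> all tree nodes carrying that item
    for items, count in transactions:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-support[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) pairs without generating candidate sets."""
    header = build_fp_tree(transactions, min_support)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path of each node of this item.
        conditional = []
        for n in nodes:
            path, parent = [], n.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, n.count))
        yield from fp_growth(conditional, min_support, itemset)


# Illustrative data only (assumed, not from the thesis).
baskets = [["bread", "milk"], ["bread", "butter", "milk"],
           ["butter", "milk"], ["bread", "butter"]]
result = dict(fp_growth([(b, 1) for b in baskets], min_support=2))
```

Here `result` maps each frequent itemset (as a `frozenset`) to its support count. The original data is read exactly twice inside `build_fp_tree`; all further work happens on the compressed tree, which is the property the abstract contrasts with Apriori's repeated scans.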
Keywords/Search Tags: frequent patterns, association rule, parallelization, FP-Forest