Research And Implementation Of Parallel FP-Growth Algorithm Based On Cluster Of PC

Posted on: 2012-04-10  Degree: Master  Type: Thesis
Country: China  Candidate: D P Wang
GTID: 2248330395455394  Subject: Computer application technology
Abstract/Summary:
FP-Growth is currently the most widely used algorithm for mining frequent itemsets, and it does not need to generate candidate itemsets. It needs only two scans of the original database: all the item information in the database is compressed into a data structure called the FP-tree, and the problem of mining frequent patterns in the database becomes one of mining the FP-tree. However, when dealing with massive data, the resulting FP-tree structure is extremely complex, so generating frequent itemsets and mining them for strong association rules requires a large memory and a fast processor. A parallel algorithm separates the computing task into sub-tasks and assigns some of them to each node in the cluster, reducing the load when a single computer is overloaded, so a parallel FP-Growth algorithm for mining frequent itemsets is important and significant in applications.

After an in-depth investigation of the theory of parallel computing, high-performance computing clusters, and the FP-Growth algorithm, this thesis presents parallel computer architecture, parallel algorithm design methods, the architecture of high-performance computing cluster technology, and the steps of the FP-Growth algorithm. In order to improve the efficiency of the parallel FP-Growth algorithm, this thesis analyzes some typical parallel FP-Growth algorithms. The results show that these algorithms were primarily run on homogeneous-hardware parallel computing platforms and did not take heterogeneous hardware into account in their load-balancing steps. Therefore, they may perform poorly on a heterogeneous-hardware parallel computing platform. To realize the parallel FP-Growth algorithm, a high-performance computing cluster whose nodes have heterogeneous hardware is designed and implemented. The parallel FP-Growth algorithm designed and implemented on this cluster, which has only three compute nodes, runs well, and the speedup is over 2.3.
Keywords/Search Tags: FP-Growth algorithm, Cluster, Parallel Computing, Association Rules, Heterogeneous-hardware Computing Platforms
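
The two database scans that the abstract describes can be illustrated with a minimal serial sketch. It is not the thesis's implementation and it does not cover the parallel or load-balancing parts. The names `FPNode`, `build_fp_tree`, and `min_support` are assumptions introduced here for illustration, and the tie-breaking rule for items with equal support is an arbitrary choice.

```python
from collections import Counter


class FPNode:
    """One node of the FP-tree (hypothetical helper class)."""

    def __init__(self, item, parent=None):
        self.item = item        # item label; None for the root
        self.count = 0          # number of transactions passing through this node
        self.parent = parent
        self.children = {}      # item label -> FPNode


def build_fp_tree(transactions, min_support):
    """Build an FP-tree with two passes over the transactions."""
    # First scan: count the support of every item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode(None)
    # Second scan: insert each transaction's frequent items,
    # ordered by descending support (ties broken by item label).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1    # shared prefixes are compressed into one path
            node = child
    return root, frequent


# Example usage with made-up transactions.
sample = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
tree, frequent_items = build_fp_tree(sample, min_support=2)
```

Because transactions that share a prefix of frequent items share a path in the tree, the tree is usually much smaller than the database, although, as the abstract notes, it can still become very large on massive data.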