Font Size: a A A

Research On Frequent Itemsets Mining Parallel Algorithm

Posted on:2018-09-07Degree:MasterType:Thesis
Country:ChinaCandidate:R D ShenFull Text:PDF
GTID:2348330542953172Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With in-depth development of social information,a large number of data that increasingly shows the features of huge number,complex structure,numerous types,valuable and so on,is generated and accumulated.Exploring useful knowledge and mining valuable information in the data have always been the main work of traditional data mining.Moreover,mining frequent itemsets is one of the most important part of it.In the face of mining massive data,conventional hardware architecture and software environment have been difficult to meet people's requirements.To solve this problem,more and more attention has been paid to parallelization technology.The traditional frequent itemsets mining algorithm mainly includes Apriori-like algorithm,Eclat-like algorithm and FP-growth-like algorithm.There have been many academic researches on the improvement of the three classical algorithms.In recent years,in the era of big data,how to parallelize the traditional frequent itemsets algorithm becomes the hotspot of data mining.In this paper,we focus on the parallelization of FP-growth algorithm in frequent itemsets mining algorithm.Two improved FP-growth parallel algorithms based on GPU are discussed respectively.First,the paper discusses the parallel algorithm of frequent itemsets mining,and makes a brief introduction to the GPU hardware architecture and CUDA programming environment.Then,in order to adapt to the parallel computing framework of CUDA GPU,the paper analyzes the main process of mining frequent itemsets in FP-growth algorithm.The construct tree process and the recursive mining process in FP-growth algorithm are improved by parallel.FPNR-growth algorithm stores the FP-tree information into the FP-array;the FPBR-growth algorithm maps the FP-tree to the binary base tree BR-tree.In the process of recursively mining frequent itemsets in FPBR-growth algorithm,we try to use core idea of TD-FP-growth algorithm and optimize the recursive efficiency by using CUDA's Dynamic Parallelism technology.In the parallel design and implementation of the algorithm,we use GPU and CPU to work together to play their own advantages,and make full use of GPU parallel computing capability to improve the performance of the algorithm.Finally,in the experimental test phase,four different data sets with different data characteristics are selected to test the performance of the two improved algorithms,and the experimental results are compared with the traditional serial FP-growth algorithm.The experimental results show that the parallel FP-growth algorithm based on GPU has greatly improved in the performance of the algorithm.
Keywords/Search Tags:Frequent itemsets mining, FP-growth, GPU, CUDA, Parallel computing
PDF Full Text Request
Related items