The Research Of Mining Frequent Itemset Based On OpenCL

Posted on: 2013-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2248330362974346
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology, the amount of data generated by everyday production and daily life is growing explosively. Processing massive data sets has therefore become a major challenge for data mining technology, and finding valuable information in a cost-effective way is a new research topic in the field.

The maturing of GPGPU technology has given new impetus to the development of data mining. Following a path very different from that of the CPU, the GPU has evolved step by step from a dedicated graphics processor into a general-purpose computing device, and it now challenges supercomputers built on traditional architectures. Compute-intensive applications such as data mining clearly stand to benefit from the inexpensive, massively parallel computing power of modern GPUs.

Association rule mining is one of the important techniques in data mining, and computing frequent itemsets is the core of the algorithm. Using GPGPU techniques to accelerate frequent itemset mining therefore has both theoretical and practical significance. This thesis analyzes and summarizes previous research on frequent itemset mining and then designs a heterogeneous CPU+GPU algorithm based on OpenCL, which uses the large number of concurrent threads created by OpenCL to speed up the computationally intensive part of the Apriori algorithm. The implementation uses a Java binding for OpenCL, and the performance of the original and improved algorithms is compared on a CPU and a GPU of the same class. The experimental results show that better acceleration is achieved on sparse data sets, and that the lower the minimum support, the higher the speedup, which reaches up to about 20 times. In addition, the thesis gives a preliminary discussion of, and experiments with, using the OpenCL Local Memory mechanism to further optimize access to transaction data. However, the final test results show that this improvement yields only about a 10% performance gain on dense data sets. Finally, the thesis points out some directions worthy of further study and improvement.
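To make the "computationally intensive part" of Apriori concrete, the following is a minimal sketch in plain Python, not the OpenCL kernel or Java code used in the thesis. The function names `count_support` and `apriori` are hypothetical and chosen only for illustration. The point it illustrates is that the support count of each candidate itemset is independent of every other candidate, which is the kind of work that could plausibly be assigned to one GPU thread (work-item) per candidate; how the thesis actually partitions the work is not stated in the abstract.

```python
from itertools import combinations


def count_support(candidate, transactions):
    """Count the transactions that contain every item of `candidate`.

    Each call is independent of all others, so (as an assumption of this
    sketch) one call could map to one OpenCL work-item.
    """
    cand = frozenset(candidate)
    return sum(1 for t in transactions if cand <= t)


def apriori(transactions, min_support):
    """Return {itemset (sorted tuple): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    level = [(i,) for i in items]  # candidate 1-itemsets
    frequent = {}
    while level:
        # Support counting: the data-parallel, compute-heavy step.
        counts = {c: count_support(c, transactions) for c in level}
        survivors = sorted(c for c, n in counts.items() if n >= min_support)
        frequent.update({c: counts[c] for c in survivors})

        # Candidate generation: join survivors that share a (k-1)-prefix,
        # then prune any candidate with an infrequent k-subset.
        next_level = []
        for a, b in combinations(survivors, 2):
            if a[:-1] == b[:-1]:
                cand = a + (b[-1],)
                if all(s in frequent
                       for s in combinations(cand, len(cand) - 1)):
                    next_level.append(cand)
        level = next_level
    return frequent
```

Lowering `min_support` lets more candidates survive each level, so the counting step dominates the running time more strongly. This is consistent with the abstract's observation that the speedup grows as the minimum support decreases.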
Keywords/Search Tags: Data mining, Frequent itemset, GPGPU, OpenCL