Design and Implementation of Association Rule Mining Algorithms in a Multi-core Parallel Environment

Posted on: 2015-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: B Z Zhang
GTID: 2268330431956582
Subject: Computer technology
Abstract/Summary:
Association rule mining is a core research direction in data mining: it discovers potentially useful rules in massive data sets and helps decision-makers make decisions. Frequent itemset mining is the most time-consuming part of association rule mining, so its speed directly affects the efficiency of association rule mining and of data mining as a whole. At the same time, with the development of multi-core hardware and the spread of multi-core processors, multi-core parallel software has become an inevitable trend. Designing high-performance multi-core parallel frequent itemset mining algorithms is therefore increasingly important.

Based on a thorough study of the classic serial frequent itemset mining algorithms, their optimizations, and multi-core parallel theory, this thesis proposes two multi-core parallel frequent itemset mining algorithms:

First, PIBT, a new multi-core parallel frequent itemset mining algorithm based on Apriori. The transaction database is split into blocks that are compressed in parallel to construct a BitTable. Using an index array together with the horizontal position vectors, the algorithm obtains frequent itemsets directly, without generating candidate itemsets. Using the vertical position vectors, it computes the support of frequent itemsets without repeatedly scanning the transaction database. A dynamic allocation policy assigns the mining tasks so that the load on each thread is as balanced as possible, and each thread mines independently to reduce read/write conflicts. A running-time comparison with improved Apriori algorithms and other multi-core parallel algorithms indicates that the algorithm is feasible.

Second, PCT-PRO, a new multi-core parallel frequent itemset mining algorithm based on FP-growth.
First, the transaction database is split into blocks that are compressed in parallel to construct a global CFP-tree, an improvement on the FP-tree, and LFP-Trees are built for the frequent items. During mining there is no need to generate large numbers of conditional pattern bases and conditional FP-trees: traversing the LFP-Trees of the frequent items is enough to obtain all frequent itemsets. A dynamic allocation strategy assigns the mining tasks so that the load on each thread is as balanced as possible, and each thread mines independently to reduce read/write conflicts. A running-time comparison with the FP-growth algorithm and other multi-core algorithms indicates that the parallel mining algorithm is highly efficient.
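
Two ideas recur in both algorithms above: support is computed from bit vectors instead of repeated database scans, and mining is split into independent tasks that are handed to threads dynamically. The sketch below illustrates only those two ideas. It is not the thesis's PIBT or PCT-PRO algorithm: it uses a generic vertical bit-vector search with one task per frequent item, and every name in it (`build_bit_vectors`, `mine_prefix`, `mine_parallel`) is hypothetical. Python integers stand in for the bit vectors, and a thread pool stands in for the dynamic allocation policy, since idle workers pull the next pending task from the pool's queue. In CPython the global interpreter lock keeps these threads from running bytecode in parallel, so the code shows the task structure rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def build_bit_vectors(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(bits):
    """Number of transactions represented by a bit vector."""
    return bin(bits).count("1")


def mine_prefix(start, items, vectors, min_support):
    """Find every frequent itemset whose smallest item is items[start].

    The task only reads the shared vectors and writes to its own dict,
    so tasks never contend for the same output.
    """
    found = {}
    # Each stack entry: (itemset, its bit vector, index of next item to try).
    stack = [((items[start],), vectors[items[start]], start + 1)]
    while stack:
        itemset, bits, nxt = stack.pop()
        found[itemset] = support(bits)
        for j in range(nxt, len(items)):
            joined = bits & vectors[items[j]]  # transactions containing both
            if support(joined) >= min_support:
                stack.append((itemset + (items[j],), joined, j + 1))
    return found


def mine_parallel(transactions, min_support, workers=4):
    """Return {itemset tuple: support} for all frequent itemsets."""
    vectors = build_bit_vectors(transactions)
    items = sorted(i for i, b in vectors.items() if support(b) >= min_support)
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per frequent item; idle workers take the next pending task.
        futures = [
            pool.submit(mine_prefix, k, items, vectors, min_support)
            for k in range(len(items))
        ]
        for future in futures:
            result.update(future.result())
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(mine_parallel(data, min_support=3).items()):
        print(itemset, count)
```

Tasks for early items explore more extensions than tasks for late items, so their sizes are uneven. That unevenness is why handing out tasks on demand balances load better than fixing a per-thread partition in advance.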
Keywords/Search Tags: association rule mining, frequent itemset mining, multi-core parallel, Apriori, FP-growth