Improvement And Parallel Processing Of Association Rules Algorithm On Data Mining

Posted on: 2017-05-17   Degree: Master   Type: Thesis
Country: China   Candidate: J F Dong   Full Text: PDF
GTID: 2348330482484834   Subject: Computer technology
Abstract/Summary:
Association rule mining, a central task in data mining, aims to find correlations among data items. It runs into bottlenecks on large data sets. This thesis analyzes why association rule mining algorithms become inefficient and studies the mining processes of the Apriori and FP-Growth algorithms in depth. Apriori, which is based on candidate sets, must scan the database multiple times when mining association rules; this consumes a large amount of I/O resources and lowers the efficiency of the algorithm. FP-Growth generates no candidate sets and scans the database only twice, but when it handles huge amounts of data, storing and operating on the FP-Tree consumes a great deal of memory. The main work on the two algorithms is as follows.

First, the improved Apriori algorithm avoids candidate sets and uses a boolean matrix to store the transaction data in compressed form. Matrices are joined according to different conditions to generate new frequent itemsets, which reduces the number of database scans.

Second, a new algorithm named TFP-Growth is proposed to address the problems FP-Growth has with massive data. It mines transaction data stored in vertical format using difference sets and a pruning technique, which effectively reduces memory consumption.

Third, a new solution for massive data is provided, built on a CPU/GPU system. Parallel versions of the improved Apriori algorithm and the FP-Growth algorithm are designed for this system: complex logic is processed on the CPU, while the parallel processing part runs on the GPU. The different characteristics of the algorithms are exploited to improve their performance. The improved Apriori and FP-Growth algorithms are shown to perform better and to save a large amount of memory at run time.
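
The boolean-matrix idea in the first point can be illustrated with a minimal sketch. The abstract does not give the thesis's matrix layout or join conditions, so everything below is an assumption made for illustration: the function names (`build_boolean_matrix`, `support`, `frequent_itemsets`) are hypothetical, and the join step is a generic Apriori-style union of frequent itemsets rather than the author's method. What the sketch does show is that, once the transactions are encoded as boolean rows, every support count becomes an AND over rows, so the raw data is read only once.

```python
from itertools import combinations


def build_boolean_matrix(transactions):
    """Encode transactions as one boolean row per item (hypothetical layout).

    matrix[item][j] is True when transaction j contains the item.
    """
    items = sorted({item for t in transactions for item in t})
    matrix = {item: [item in t for t in transactions] for item in items}
    return items, matrix


def support(itemset, matrix):
    """Count transactions containing every item by AND-ing the item rows."""
    rows = [matrix[item] for item in itemset]
    return sum(all(column) for column in zip(*rows))


def frequent_itemsets(transactions, min_support):
    """Level-wise mining that touches the raw transactions only once."""
    # Single pass over the raw data; all later counting uses the matrix.
    items, matrix = build_boolean_matrix(transactions)
    frequent = {}
    current = [(i,) for i in items if support((i,), matrix) >= min_support]
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, matrix)
        k += 1
        # Generic join (an assumption, not the thesis's conditions):
        # union pairs of frequent itemsets and keep those of size k.
        joined = {tuple(sorted(set(a) | set(b))) for a, b in combinations(current, 2)}
        current = sorted(
            c for c in joined
            if len(c) == k and support(c, matrix) >= min_support
        )
    return frequent


# Made-up example data, not taken from the thesis.
example_transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
example_result = frequent_itemsets(example_transactions, min_support=3)
```

With a minimum support of 3, the three single items and the three pairs are frequent in this toy data, while the triple is not. Because the AND over rows is independent for each transaction column, this is also the kind of step that could plausibly be offloaded to a GPU, as the third point describes.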
Keywords/Search Tags: data mining, association rules, Apriori algorithm, frequent pattern growth algorithm