Performance Optimization Techniques for the Parallel Apriori Algorithm

Posted on: 2011-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Xu
Full Text: PDF
GTID: 2178330338989607
Subject: Computer Science and Technology
Abstract/Summary:
With the development of computers and the internet, massive amounts of data have been created and collected. How to make use of this data is a major challenge. Enterprises in particular need to mine and analyze large volumes of data to extract accurate and useful information, which makes association rule mining algorithms especially important. The traditional parallel Apriori association rule algorithm has some inherent defects, so it handles such workloads poorly and performs badly. This thesis therefore focuses on performance optimization techniques for the parallel Apriori algorithm.

We analyze the traditional serial and parallel Apriori algorithms and study optimization techniques for the parallel algorithm at two levels: its logical process and its physical implementation. The main contributions are as follows.

1. Optimization of the logical process of the parallel algorithm. By analyzing the logical process of the parallel algorithm, we propose two optimization techniques that address the size of the candidate sets and load imbalance. We also propose a rule generation optimization based on a Trie tree. Extensive experiments show that these techniques effectively reduce the size of the candidate sets, reduce load imbalance, and improve rule generation efficiency.

2. Optimization of the physical implementation of the parallel algorithm. By analyzing the physical implementation of the parallel algorithm, we propose an optimization technique that addresses the number of database scans. We also propose a memory allocation optimization based on a Trie tree. Extensive experiments show that these techniques effectively reduce the number of database scans and improve memory utilization.

Based on the theoretical research above, we design a parallel association rule mining system built on the Apriori algorithm. The system includes a database module, a load balancing module, a rule mining module, a rule generation module, and other components. It can mine the rules of interest and provides a basic platform for related experiments and research on the optimization techniques.
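
For readers unfamiliar with the baseline, the two costs the abstract targets (candidate set size and repeated database scans) are visible in the textbook serial Apriori procedure. The following is only a minimal sketch of that standard procedure, not the parallel or optimized implementation described in the thesis. The function names `generate_candidates` and `apriori`, and the use of an absolute count for `min_support`, are assumptions made for illustration.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune.

    A candidate is kept only if every (k-1)-subset is itself frequent
    (the Apriori property). This is what bounds the candidate set size.
    """
    prev = set(frequent_prev)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:
                continue
            if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets appearing in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items with one pass over the data.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        candidates = generate_candidates(frequent.keys(), k)
        # One full pass over the data per level: the repeated scan cost.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

In this sketch every level rescans all transactions and tests every candidate against every transaction. A prefix tree (Trie) over the candidates is a common way to speed up that subset test, but it is left out here to keep the example short.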
Keywords/Search Tags: association rule mining, Apriori, parallel, optimization