Association Rule Mining Algorithm In The Grid Environment

Posted on: 2011-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: X T Wu
Full Text: PDF
GTID: 2208360308471880
Subject: Computer application technology
Abstract/Summary:
A grid is a distributed computing platform built on the Internet that integrates a variety of computing resources and turns them into widely available, reliable, and economical computing power. Its defining characteristics are that it is distributed, heterogeneous, shared, dynamic, and virtual. Data mining is the non-trivial process of discovering implicit, previously unknown, and potentially useful patterns or information in incomplete, noisy, fuzzy, and random data sets. Association rule mining is one of the main research topics in data mining; association rules reflect dependencies and correlations between things and have broad application. This thesis studies distributed association rule mining algorithms that use the grid as the distributed computing platform. The main contributions are as follows.

First, a grid-based frequent pattern mining algorithm (GridDMF) is presented. Each node independently mines its local frequent itemsets, which are merged into a set of candidate global frequent itemsets. The candidates are then pruned and broadcast to the other nodes. Each node scans its database to obtain local counts for the global candidates, and these counts are collected to determine the final global frequent itemsets. Pruning the candidate itemsets reduces both the communication cost between nodes and the itemset-counting computation, which improves overall mining efficiency. Experiments on a star spectral data set show the validity and effectiveness of the algorithm.

Second, a grid-based distributed algorithm for constructing an FP-tree (GridDBMA) is presented. A global item header table is built first; each node then independently constructs a local frequent pattern tree (BFP-tree) following the item order of the header table. A merge algorithm combines the local frequent pattern trees into a global tree, from which the global frequent itemsets can be extracted. Because the traditional storage structure of the frequent pattern tree is improved, the size of the tree and the communication between nodes are reduced, tree traversal becomes more convenient and effective, and frequent itemset mining becomes more efficient. Experiments on a star spectral data set show the validity and effectiveness of the algorithm.
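
The flow of the first algorithm (GridDMF), as the abstract describes it, is: mine locally, merge into global candidates, prune, then count globally. A minimal Python sketch of that flow may make the steps concrete. It is an illustration under stated assumptions, not the thesis's implementation. Grid nodes are simulated as in-memory lists of transactions, and the function names `local_frequent` and `grid_mine` are hypothetical. The abstract does not say how candidates are pruned, so the sketch substitutes a simple upper-bound test: a candidate that is not locally frequent on a node can occur there at most (local threshold - 1) times.

```python
from itertools import combinations
from math import ceil


def local_frequent(transactions, min_ratio):
    """Apriori-style mining on one node; returns {itemset: local count}."""
    min_count = ceil(min_ratio * len(transactions))
    level = [frozenset([i]) for i in {i for t in transactions for i in t}]
    found = {}
    while level:
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        found.update(frequent)
        # Join frequent k-itemsets that differ in one item into (k+1)-candidates.
        level = list({a | b for a, b in combinations(frequent, 2)
                      if len(a | b) == len(a) + 1})
    return found


def grid_mine(partitions, min_ratio):
    """Simulated multi-node mining; returns {itemset: global count}."""
    parts = [[frozenset(t) for t in p] for p in partitions]
    # Step 1: every node mines its own partition independently.
    local = [local_frequent(p, min_ratio) for p in parts]
    # Step 2: merge local results into candidate global frequent itemsets.
    candidates = set().union(*local)
    global_min = ceil(min_ratio * sum(len(p) for p in parts))

    # Step 3 (assumed pruning rule, not taken from the thesis): drop a
    # candidate whose best possible global count is still below the threshold.
    def upper_bound(c):
        return sum(lf[c] if c in lf
                   else max(0, ceil(min_ratio * len(p)) - 1)
                   for lf, p in zip(local, parts))

    survivors = {c for c in candidates if upper_bound(c) >= global_min}
    # Step 4: nodes scan their data only for survivors they have not counted.
    result = {}
    for c in survivors:
        total = sum(lf[c] if c in lf else sum(1 for t in p if c <= t)
                    for lf, p in zip(local, parts))
        if total >= global_min:
            result[c] = total
    return result


node_a = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b"}]
node_b = [{"a", "b"}, {"a", "b", "d"}, {"c", "d"}, {"a"}]
for itemset, count in sorted(grid_mine([node_a, node_b], 0.5).items(),
                             key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

With these two toy partitions and a 50% minimum support, the candidates {d}, {a, c}, and {b, c} are pruned before any further scan. Only {c} needs a rescan, on the node where it was not locally frequent. This shows how pruning can cut both the broadcast size and the counting work.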
Keywords/Search Tags: Distributed data mining, Grid, Association rules, Minimum support, Star spectral data