Association Rule Mining Algorithm In The Grid Environment

Posted on:2011-10-13

Degree:Master

Type:Thesis

Country:China

Candidate:X T Wu

Full Text:PDF

GTID:2208360308471880

Subject:Computer application technology

Abstract/Summary:

The grid is a distributed computing platform built on the Internet. It integrates a variety of computing resources and turns them into widely available, reliable, and economical computing power. Its defining characteristics are that it is distributed, heterogeneous, shared, dynamic, and virtual. Data mining is the non-trivial process of discovering implicit, previously unknown, and potentially useful patterns or information in incomplete, noisy, fuzzy, and random data sets. Association rule mining is one of the main research topics in data mining; association rules reflect the dependency and correlation between things and have broad application. This thesis studies distributed association rule mining algorithms that use the grid as the distributed computing platform. The main work is as follows:

First, a grid-based frequent pattern mining algorithm (GridDMF) is presented. Each node independently mines its local frequent itemsets, and these are merged into the candidate global frequent itemsets. The candidate itemsets are then pruned and broadcast to the other nodes. Each node scans its local database to count the global candidates, and these local counts are collected to determine the final global frequent itemsets. Pruning the candidate itemsets reduces both the communication cost between nodes and the itemset computation, and thereby improves the overall mining efficiency. Experiments on a star spectral data set show the validity and effectiveness of the algorithm.

Second, a grid-based distributed algorithm for constructing the FP-tree (GridDBMA) is presented. First, the global item header table is built. Then each node independently constructs its local frequent pattern tree (BFP-tree), following the item order of the header table. A merge algorithm unites the local frequent pattern trees into a global tree, from which the global frequent itemsets can be extracted. Because the traditional storage structure of the frequent pattern tree is improved, the tree is smaller, communication between nodes is reduced, tree traversal is more convenient and efficient, and frequent itemset mining is more efficient. Experiments on a star spectral data set show the validity and effectiveness of the algorithm.
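The GridDMF steps described above (local mining, merging, pruning, broadcasting, and collecting local counts) can be illustrated with a minimal single-process sketch. It is not the thesis implementation. Grid nodes are simulated as in-memory lists of transactions, local mining is done by brute-force enumeration rather than a real frequent pattern miner, and all function names are hypothetical. The abstract does not state the pruning rule, so the sketch assumes a simple upper-bound test: a candidate is dropped when its known local counts, plus the largest count it could have on nodes where it was not locally frequent, cannot reach the global threshold.

```python
from itertools import combinations
from math import ceil


def local_frequent(transactions, min_sup, max_len=3):
    """Brute-force local mining (a stand-in for a real miner).

    Returns {itemset: local count} for itemsets of up to max_len items
    whose count meets the local minimum support.
    """
    need = ceil(min_sup * len(transactions))
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {s: c for s, c in counts.items() if c >= need}


def count_support(transactions, itemset):
    """Scan one node's database and count transactions containing itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t))


def grid_dmf_sketch(partitions, min_sup, max_len=3):
    """Simulate the described steps over a list of node databases."""
    total = sum(len(p) for p in partitions)
    global_need = ceil(min_sup * total)

    # Step 1: every node mines its local frequent itemsets independently.
    local = [local_frequent(p, min_sup, max_len) for p in partitions]

    # Step 2: merge the local results into global candidates.
    candidates = set().union(*local)

    # Step 3 (assumed rule): prune candidates whose upper bound on the
    # global count is below the global threshold.
    pruned = set()
    for cand in candidates:
        bound = 0
        for part, freq in zip(partitions, local):
            if cand in freq:
                bound += freq[cand]  # exact local count is known
            else:
                # Not locally frequent, so at most (local threshold - 1).
                bound += max(ceil(min_sup * len(part)) - 1, 0)
        if bound >= global_need:
            pruned.add(cand)

    # Step 4: "broadcast" surviving candidates; each node scans its data
    # and the local counts are summed into global counts.
    result = {}
    for cand in pruned:
        global_count = sum(count_support(p, cand) for p in partitions)
        if global_count >= global_need:
            result[cand] = global_count
    return result


if __name__ == "__main__":
    node_a = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b", "d"]]
    node_b = [["a", "b"], ["b", "c"], ["a", "b", "d"], ["c", "d"]]
    for itemset, count in sorted(grid_dmf_sketch([node_a, node_b], 0.5).items()):
        print(itemset, count)
```

In this toy run the pruning step discards two merged candidates before any counting scan is requested, which is the kind of saving in communication and computation that the abstract attributes to pruning.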

Keywords/Search Tags:

Distributed data mining, Grid, Association rules, Minimum support, Star spectral data

Related items

1	Researches And Applications On Association Rules Mining With Multiple Minimum Supports
2	Minimum Support Association Rule Mining
3	Analysis Of Student Achievement Based On Data Mining
4	Association Rules Mining Algorithm And Its Application On Telecommunication Industry
5	Research On Data Flow Association Rule Mining Algorithm Based On Sliding Window
6	Distributed Data Mining Based On Grid Services
7	Multicast-based Distributed Association Rule Mining Algorithm
8	Research And Optimization Of Association Rules Based On Can Tree
9	Mining System, Based On The Constraint Concept Lattice Stellar Spectral Data Classification Rules
10	Research And Improvement Of Algorithm For Incremental Updating Association Rule In Retail Business Intelligence System