Research On Algorithms For Distributed Mining Of Association Rules

Posted on: 2007-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Wei
GTID: 2178360185977190
Subject: Computational Mathematics
Abstract/Summary:
With the growing use of database and network technology, many distributed databases have come into being. Mining useful knowledge from distributed databases to support decision-making is a challenging problem. Distributed data mining has practical uses in many fields, such as finance, telecommunications, insurance, market analysis, anomaly detection, network security, and scientific decision-making. Association rule mining is one of the core data mining tasks and has attracted tremendous interest among researchers. This paper studies distributed mining and updating of association rules with item constraints, distributed mining of fuzzy quantitative association rules, mining of association rules in distributed XML data, pruning and clustering of discovered association rules, and visualization of association rules. The main contributions of the paper are as follows.
(1) The concept of an inducted set is introduced, and a fast algorithm, DCAR, is proposed for mining constrained association rules in distributed systems. It includes two efficient algorithms, CLF and CGF, for distributed mining of frequent itemsets that satisfy a boolean constraint. This provides a new method for mining interesting association rules in a distributed environment.
(2) An algorithm, DUCAR, is proposed to update constrained association rules when transactions are inserted into or deleted from the distributed databases.
(3) An algorithm, DFAM, is proposed for automatically generating global fuzzy sets and their corresponding membership functions, based on distributed clustering over the distributed database. An algorithm, DFAR, for distributed mining of fuzzy quantitative association rules is then discussed.
(4) An efficient mining algorithm, FreqtTree, is presented for discovering all frequent patterns in XML data, and mining of global frequent patterns from XML data in a distributed environment is then considered.
(5) Association rule mining commonly generates a very large number of rules. To address this, an algorithm, ADRR, is proposed for pruning the discovered association rules by removing redundant rules, together with an algorithm, ACAR, for generating a clustering structure suitable for exploratory analysis.
(6) A novel method, ARVir, is introduced, which improves on the parallel coordinates technique to visualize association rules. A system for visualizing association rules based on ARVir is implemented with Java3D.
(7) A prototype system, DDMINER, is developed for distributed association rule mining.
The algorithms presented in this paper are implemented and tested. The experimental results show that the algorithms are effective and efficient.
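
As an illustration of the general idea behind contribution (1), the following Python sketch mines globally frequent itemsets from several sites and keeps only those that satisfy a boolean item constraint. It is not DCAR, CLF, or CGF, whose details are not given in this abstract. It is a generic count-distribution scheme in which each site counts candidates locally, the counts are summed, and the constraint is applied as a final filter. The function names, the `max_size` parameter, and the sample data are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Count, at one site, how many local transactions contain each candidate."""
    counts = Counter()
    for transaction in transactions:
        for candidate in candidates:
            if candidate <= transaction:
                counts[candidate] += 1
    return counts


def globally_frequent(sites, min_support, constraint, max_size=3):
    """Return {itemset: global count} for constrained, globally frequent itemsets.

    sites       -- list of sites, each a list of transactions (sets of items)
    min_support -- minimum fraction of all transactions, across every site
    constraint  -- hypothetical boolean predicate over an itemset
    """
    total = sum(len(site) for site in sites)
    items = sorted({item for site in sites for t in site for item in t})
    level = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while level and size <= max_size:
        # Each site counts the same candidates; only the counts are merged.
        merged = Counter()
        for site in sites:
            merged.update(local_counts(site, level))
        survivors = {c: n for c, n in merged.items() if n / total >= min_support}
        frequent.update(survivors)
        # Simplified Apriori-style join: build candidates one item larger.
        size += 1
        level = list({a | b for a, b in combinations(survivors, 2)
                      if len(a | b) == size})
    # Assumption: the constraint is a post-filter here, not pushed into mining.
    return {c: n for c, n in frequent.items() if constraint(c)}


# Made-up sample data: two sites with three transactions each.
site_a = [{"bread", "milk"}, {"bread", "butter"}, {"milk", "butter", "bread"}]
site_b = [{"bread", "milk"}, {"milk"}, {"bread", "milk", "butter"}]
result = globally_frequent([site_a, site_b], 0.5, lambda s: "butter" in s)
print(result)
```

With this sample data and a minimum support of 0.5, the itemsets that survive the constraint are {butter} and {bread, butter}, each with a global count of 3. Filtering after mining is the simplest choice but does the most work. An algorithm designed for constrained mining would normally use the constraint to prune candidates earlier.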
Keywords/Search Tags: Distributed Data Mining, Association Rule, Globally Frequent Itemsets, Item Constraints, XML, Visualization