Research On Algorithm For Distributed Mining Of Association Rules

Posted on: 2009-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: J F Guo
GTID: 2178360272479593
Subject: Computer software and theory
Abstract/Summary:
Data mining is an important area of knowledge discovery in databases (KDD), and association rule mining in large databases is more widely applied than other methods. Existing algorithms and modules cater to a centralized environment, such as a database or data warehouse. With the development of distributed database and network technology, collecting and integrating large amounts of data from Internet sites in one place is not practical. To address this problem, this dissertation studies the mining of association rules in distributed databases.

First, the dissertation introduces and analyses the basic concepts and algorithms of association rule mining, both centralized and distributed. It then discusses the relationship among three kinds of frequent itemsets and proposes mining only the set of maximal frequent itemsets instead of every frequent itemset. To find room for improvement, an experiment is performed on existing distributed association rule mining algorithms, from which an improvement strategy and solution are derived. An efficient distributed association rule mining algorithm based on constrained subtrees is proposed. Unlike other algorithms for mining global maximal frequent itemsets, it uses the FP-tree structure to obtain all global maximal frequent itemsets in a single mining pass, with very fast superset checking. It mines all maximal frequent itemsets with only two database scans, and then assigns prior weights to the sites so that the global maximal frequent itemsets can be obtained easily. Finally, the improved algorithm is applied to university teaching and scientific research data, with the aim of discovering latent rules that can support teaching and research activities in the following year.
Keywords/Search Tags:Distributed Data Mining, Association Rule, Globally Frequent Itemsets, Globally Maximum Frequent Itemsets, Constrained Subtree
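
The idea of reporting only maximal frequent itemsets can be illustrated with a small sketch. This is not the constrained-subtree FP-tree algorithm proposed in the thesis: it is a brute-force enumeration written only to show what "global support", "maximal" and "superset checking" mean. The function names, the sample transactions and the 50% support threshold are all assumptions made for illustration, and the site weighting step mentioned in the abstract is not modelled.

```python
from itertools import combinations


def support_counts(transactions):
    """Count every itemset that occurs in the transactions (brute force)."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return counts


def global_frequent(sites, min_support):
    """Sum local counts over all sites and keep globally frequent itemsets.

    `sites` is a list of local transaction lists, one per site (hypothetical
    layout). `min_support` is a fraction of the total transaction count.
    """
    total = {}
    n = 0
    for transactions in sites:
        n += len(transactions)
        for itemset, c in support_counts(transactions).items():
            total[itemset] = total.get(itemset, 0) + c
    threshold = min_support * n
    return {s: c for s, c in total.items() if c >= threshold}


def maximal(frequent):
    """Keep itemsets that have no frequent proper superset (superset check)."""
    return {s for s in frequent if not any(s < other for other in frequent)}


# Two hypothetical sites holding three transactions each.
sites = [
    [["a", "b", "c"], ["a", "b"], ["a", "c"]],
    [["a", "b", "c"], ["b", "c"], ["a", "b", "c", "d"]],
]

freq = global_frequent(sites, 0.5)
mfi = maximal(freq)
print(len(freq), "globally frequent itemsets")
print("maximal:", [sorted(s) for s in mfi])
```

In this toy data every globally frequent itemset is a subset of a single maximal one, which is why the maximal set can serve as a compact summary. A practical implementation would avoid enumerating all subsets, for example by using an FP-tree as the thesis describes.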