Research On Parallel Algorithms For Mining Association Rules

Posted on: 2009-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Wang
GTID: 2178360245471549
Subject: Computer software and theory
Abstract/Summary:
Mining association rules from large databases is an important problem in data mining. Processing large databases on a single sequential machine becomes nearly impossible, for both time and space reasons, so parallel algorithms for this problem are urgently needed. In this dissertation, algorithms for mining association rules are studied on two types of parallel computer architecture. The content of the dissertation is as follows:

1. Research on mining association rules on distributed storage systems
To address the problems in traditional parallel methods, MRPD, a parallel algorithm with multi-transmitting redistributed data, is proposed and its correctness is proved theoretically. In MRPD, the data is first redistributed into groups, and the groups are then multi-transmitted according to the requests of the compute nodes. Each node computes frequent itemsets asynchronously once it has received a full group, and finally all frequent itemsets are collected. The algorithm is compared experimentally with traditional algorithms under different data distribution conditions.

2. Research on mining association rules on shared storage systems
Building on a study of the Apriori algorithm, two parallel algorithms that run on SMP machines are proposed: HA-1, based on a hash table, and HA-2, based on a local database. Their performance is compared with that of traditional algorithms. Some drawbacks of parallel mining algorithms on distributed storage systems, such as heavy data transfer overhead and a low degree of parallelism, are overcome to a preliminary extent.
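The abstract does not give the details of MRPD, HA-1, or HA-2, so the following is not a reconstruction of any of them. It is only a minimal sketch of the general idea they build on: Apriori-style frequent itemset mining in which support counting is split across partitions of the data and the partial counts are merged. The function names, the `workers` parameter, and the round-robin partitioning are all assumptions made for illustration, and Python threads are used here only to show the structure, not to demonstrate real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_candidates(partition, candidates):
    """Count how many transactions in one partition contain each candidate."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, pruning by the Apriori property."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            # Keep the union only if every (k-1)-subset is itself frequent.
            if len(union) == k and all(
                frozenset(subset) in frequent
                for subset in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def parallel_apriori(transactions, min_support, workers=2):
    """Return {itemset: support count} for itemsets with count >= min_support.

    Hypothetical helper: the partitioning scheme is an assumption,
    not the redistribution strategy described in the dissertation.
    """
    transactions = [frozenset(t) for t in transactions]
    # Round-robin split of the transactions into one partition per worker.
    partitions = [transactions[i::workers] for i in range(workers)]
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while candidates:
            # Each worker counts the same candidates on its own partition.
            partial = pool.map(count_candidates, partitions, [candidates] * workers)
            total = Counter()
            for counts in partial:
                total.update(counts)  # merge partial counts by addition
            frequent = {c: n for c, n in total.items() if n >= min_support}
            result.update(frequent)
            k += 1
            candidates = generate_candidates(set(frequent), k)
    return result


if __name__ == "__main__":
    data = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, count in parallel_apriori(data, min_support=2).items():
        print(sorted(itemset), count)
```

In this sketch every worker scans only its own partition on each pass, and the only data exchanged between passes is the candidate set and the partial counts. Data transfer and the degree of parallelism are the two costs the abstract says the proposed algorithms aim to reduce.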
Keywords/Search Tags: association rules, parallel algorithm, data mining