Research on distributed data mining system and algorithm based on multi-agent

Posted on:2010-10-28

Degree:M.Sc

Type:Thesis

University:Université du Québec à Chicoutimi (Canada)

Candidate:Jiang, Lingxia

Full Text:PDF

GTID:2448390002485116

Subject:Computer Science

Abstract/Summary:

Data mining is the extraction of hidden, previously unknown knowledge and rules of potential value for decision-making from large volumes of data in databases. Association rule mining is one of its main research areas and is widely used in practice. With the development of network technology and the growing level of IT application, distributed databases have come into common use. Distributed data mining extracts global knowledge useful for management and decision-making from geographically distributed databases, and it has become an important issue in data mining analysis. It allows computers at different sites on the Internet to carry out a mining task together, which not only improves mining efficiency and reduces the amount of data transmitted over the network, but also benefits data security and privacy. Building on the related theory and the current state of research in data mining and distributed data mining, this thesis focuses on the structure of distributed mining systems and on distributed association rule mining algorithms.

Key words: data mining, distributed, association rule, multi-agent, RK-tree algorithm

This thesis first proposes a structure for a distributed data mining system based on multi-agent technology. The system adopts a star network topology and uses multiple agents to mine massive data stored in a distributed manner. On top of this system, the thesis puts forward a new distributed association rule mining algorithm, the RK-tree algorithm, which rests on the idea of two-stage knowledge combination. Each sub-site first mines local frequent itemsets from its local database and sends them to the main site. The main site combines these local frequent itemsets into a set of global candidate frequent itemsets and sends the candidates to every sub-site. Each sub-site counts the support of the global candidates and sends the counts back to the main site. Finally, the main site combines the results from the sub-sites to obtain the global frequent itemsets and the global association rules. The algorithm needs only three rounds of communication between the main site and the sub-sites, which greatly reduces both the volume and the number of communications and improves the efficiency of selection. Moreover, each sub-site can use any existing, well-performing centralized association rule mining algorithm for its local mining, which gives better local mining efficiency and reduces the workload. The algorithm is simple and easy to implement. The last part of the thesis presents the conclusions of the analysis and directions for further research.
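
The three-round exchange described in the abstract can be illustrated with a small Python sketch. This is not the RK-tree algorithm itself, whose internal data structure the abstract does not describe. It only simulates the message pattern in a single process, under some assumptions: the function names (`mine_local`, `count_support`, `mine_global`) are hypothetical, the local miner is a brute-force stand-in for whatever centralized algorithm a sub-site would really use, and the support threshold is taken to be the same relative fraction at every site. Rule generation from the global frequent itemsets is left out.

```python
from collections import Counter
from itertools import combinations


def mine_local(transactions, min_support):
    """Hypothetical local miner: brute-force enumeration of itemsets.

    Stands in for any centralized algorithm a sub-site might use.
    Returns the itemsets whose local relative support >= min_support.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {itemset for itemset, c in counts.items() if c >= threshold}


def count_support(transactions, candidates):
    """Count, for each candidate itemset, the local transactions containing it."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def mine_global(sites, min_support):
    """Simulate the three communication rounds in one process."""
    # Round 1 (sub-sites -> main site): local frequent itemsets.
    local_frequent = [mine_local(db, min_support) for db in sites]

    # Main site: the first combination step builds the global candidates.
    candidates = set().union(*local_frequent)

    # Round 2 (main site -> sub-sites): candidates are broadcast.
    # Round 3 (sub-sites -> main site): local support counts come back.
    local_counts = [count_support(db, candidates) for db in sites]

    # Main site: the second combination step sums counts and filters.
    total = sum(len(db) for db in sites)
    global_counts = Counter()
    for counts in local_counts:
        global_counts.update(counts)
    return {c: n for c, n in global_counts.items() if n >= min_support * total}


# Two sub-sites with three transactions each (made-up illustration data).
sites = [
    [{"a", "b"}, {"a", "b", "c"}, {"a"}],
    [{"b", "c"}, {"a", "b"}, {"b"}],
]
result = mine_global(sites, min_support=0.5)
for itemset, count in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

Using the union of local frequent itemsets as the candidate set loses nothing under the shared relative threshold assumed here: an itemset that is infrequent at every site cannot reach the threshold over the combined data, so every globally frequent itemset appears in at least one local result.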

Keywords/Search Tags:

Data mining, Algorithm, Frequency itemset, Main site point, Each sub-site point, Multi-agent

Related items

1	Based On The Distance Education Web Site Information Collection And Data Mining Technology Research
2	Research Of Self-adapting Distance Education Web Site Based On The Web Usage Mining
3	The Design&Realization Of Real-time Site Present System
4	Site Selection Optimization Of 5G Network Base Station Based On Weighted Minimal Modular Ideal Point Method
5	PagePrompter: An intelligent agent for Web navigation created using data mining techniques
6	Research On Some Key Technologies In Web Site Summarization
7	Research And Implementation Of Multi Site High Efficiency Navigation Algorithm
8	Research And Scheduling Algorithm Implementation For The Site-based And Resource-constrained Project Scheduling Problem
9	Research On Key Technologies Of Beidou Third Generation Multi Frequency Point Satellite Signal Acquisition
10	Multi-level Multi-dimensional Frequent Itemset Mining