Privacy-Preserving Distributed Data Mining System

Posted on:2005-07-04

Degree:Master

Type:Thesis

Country:China

Candidate:X C Shen

Full Text:PDF

GTID:2168360122981239

Subject:Computer application technology

Abstract/Summary:

With the arrival of the information era and the rapid development of computer network technology, how to mine knowledge efficiently from data in a distributed environment has become a new topic in information science research. Association rule mining is an important data mining task, and its main challenges at present are computational efficiency and memory capacity. Developing distributed mining algorithms is a promising way to address both, so in this thesis we focus on distributed association rule mining. Three characteristics of the data motivate the work: the data can be too large to load into memory at once; the data can be confidential, so that customers are willing to provide only the results of analysis and not the data themselves; and the data can be distributed across sites.

Research on distributed data mining is still at an early stage, and many problems remain to be solved. Among them, the system architecture and the algorithms are the most important, and this thesis explores both. First, we propose a distributed data mining system that mines knowledge from large distributed data sets. Because the system transfers only the intermediate results of local data mining, it greatly reduces network traffic and enhances the security and privacy of the data. The system uses CORBA as its distributed software engine, so it does not depend on any particular programming language or computing platform.

Second, we propose new ideas and implementation techniques for distributed data mining algorithms on the basis of this prototype system. We concentrate on association rule mining and adapt the conventional algorithm to distributed and parallel data mining in two ways. The first goes from rules to rules: association rules are first mined at the local sites, and global association rules are then generated from these local rules. The second goes from data to rules: the local sites exchange their intermediate results, and global association rules are then generated from these results. We propose a new algorithm that follows the second approach. With it, frequent itemsets that meet a minimum support level can be discovered without revealing the customers' information. Finally, we draw some conclusions and outline directions for future work.
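The "from data to rules" approach can be illustrated with a small Python sketch. It is not the thesis's algorithm, which the abstract does not specify. It assumes that each site counts its local k-itemsets and that the sites combine those counts with a ring-style secure sum, a standard masking technique swapped in here for illustration. All function names (`local_counts`, `secure_sum`, `global_frequent_itemsets`) are hypothetical, and the sites are simulated in one process rather than connected through CORBA. The sketch also simplifies the privacy model: the candidate itemsets and the total transaction count are shared in the clear.

```python
import random
from collections import Counter
from itertools import combinations


def local_counts(transactions, k):
    """Count every k-itemset that occurs in one site's transactions."""
    counts = Counter()
    for transaction in transactions:
        # Sort and deduplicate so each itemset has one canonical tuple form.
        for itemset in combinations(sorted(set(transaction)), k):
            counts[itemset] += 1
    return counts


def secure_sum(values, modulus=2**32, rng=None):
    """Simulated ring-based secure sum (an assumed technique, not the thesis's).

    The first site adds a random mask, every site adds its own value to the
    running total, and the first site removes the mask at the end. No site
    sees another site's individual value, only a masked running total.
    """
    rng = rng or random.Random()
    mask = rng.randrange(modulus)
    running = mask
    for value in values:
        running = (running + value) % modulus
    return (running - mask) % modulus


def global_frequent_itemsets(sites, k, min_support):
    """Return the k-itemsets whose global support is at least min_support."""
    total = sum(len(transactions) for transactions in sites)
    per_site = [local_counts(transactions, k) for transactions in sites]
    # Simplification: the union of candidate itemsets is shared in the clear.
    candidates = set().union(*per_site)
    frequent = {}
    for itemset in candidates:
        count = secure_sum([counts[itemset] for counts in per_site])
        if count / total >= min_support:
            frequent[itemset] = count
    return frequent


# Two hypothetical sites, each holding its own transactions.
sites = [
    [["a", "b"], ["a", "c"]],
    [["a", "b"], ["b", "c"]],
]
print(global_frequent_itemsets(sites, k=2, min_support=0.5))
# prints {('a', 'b'): 2}
```

Only aggregated counts leave the sites in this sketch, which mirrors the abstract's point that exchanging intermediate results instead of raw data reduces traffic and protects customer records.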

Keywords/Search Tags:

data mining, association rules, distributed, privacy-preserving

Related items

1	The Research Of Privacy-preserving Distributed Association Rules Mining Algorithm
2	Research On Privacy Preserving Algorithms For Association Rules Mining In Distributed Environment
3	Research On Algorithm Of Distributed Privacy-preserving Mining Of Association Rules
4	Research On Privacy-preserving Association Rules Mining In Distributed Environment
5	The Method Research Of Mining Association Rules In Distributed Environments
6	The Research Of Distributed Privacy Preserving Data Mining Based On Intelligent Agent
7	Research On Distributed Association Rules Mining Algorithm And Its Applications
8	Research On Mining Technology Of Association Rules And Meta-Rules
9	The Research Of Association Rules Mining Based On Privacy Preserving Techniques
10	The Binding Association Rules In The Distributed Environment Mining Algorithm And Implementation