Research on Privacy-Preserving Data Mining

Posted on: 2013-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: S T Xie
Full Text: PDF
GTID: 2218330374461926
Subject: Computer software and theory
Abstract/Summary:
The development of network and database technology has made it possible to accumulate and process massive amounts of data in many fields, and how to extract valuable knowledge from these data has attracted many researchers' attention. Data mining makes it possible to extract useful information and hidden patterns from massive data. In most cases, the data to be mined are stored in different places and owned by different people, companies, or organizations. Some data owners may want to cooperate with others and run data mining over the union of their data sets to find useful information or hidden patterns, but they do not want to disclose their private data. This problem is known as privacy-preserving data mining, and it is of considerable practical significance.

The goal of privacy-preserving data mining is to let multiple parties cooperatively mine the data they own and obtain the result they want without learning the other parties' private data; that is, no participant's private data is disclosed during the data mining process. There are three main privacy-preserving data mining techniques: random perturbation, data anonymization, and secure multi-party computation. This dissertation focuses on the last of these. Secure multi-party computation refers to a setting of mutual distrust in which n (n ≥ 2) parties perform a cooperative computation and obtain the desired result without disclosing their private data.

Our main work is as follows. First, we review the state of the art in data mining and in secure multi-party computation. Second, we elaborate on the relation between data mining and secure multi-party computation. Finally, we summarize privacy-preserving clustering algorithms and, building on existing algorithms, propose two novel privacy-preserving data mining algorithms. One is a privacy-preserving K-Means clustering algorithm based on homomorphic encryption. The other is a privacy-preserving K-Means clustering algorithm based on scalar product protocols for semi-honest parties. Compared with existing algorithms, the two proposed algorithms have lower communication overhead, lower computational complexity, and higher accuracy.
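
As a rough illustration of how additively homomorphic encryption can support a K-Means step, the sketch below combines per-party partial sums for one cluster coordinate without any party revealing its own value in the clear. It is not the algorithm proposed in the thesis, whose details are not given in this abstract. It assumes a toy Paillier scheme with tiny hard-coded primes (insecure, for demonstration only), non-negative integer inputs, and semi-honest parties; all function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `secure_sum`) are hypothetical.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier key generation. The primes are far too small to be secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                    # standard simple choice of generator
    mu = pow(lam, -1, n)         # modular inverse of lam mod n (valid for g = n + 1)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer m with 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:  # r must be invertible mod n
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c using the private key."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def secure_sum(pub, local_values):
    """Each party encrypts its own value; ciphertexts are combined blindly."""
    aggregate = encrypt(pub, 0)
    for value in local_values:
        aggregate = add_encrypted(pub, aggregate, encrypt(pub, value))
    return aggregate


if __name__ == "__main__":
    pub, priv = keygen()
    # Hypothetical partial sums of one coordinate of one cluster, one per party.
    partial_sums = [120, 75, 310]
    total = decrypt(pub, priv, secure_sum(pub, partial_sums))
    print(total == sum(partial_sums))
```

In a K-Means round, the same aggregation would be applied to each cluster's coordinate sums and point counts, and the key holder would divide the decrypted totals to obtain the new centroids. With only two parties, the decrypted total reveals the other party's contribution, which is one reason real protocols need additional masking.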
Keywords/Search Tags: Data Mining, Distributed Data Mining, Secure Multi-party Computation, K-Means Clustering, Homomorphic Encryption