
Research On Privacy-Preserving Data Mining Algorithms

Posted on: 2010-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: C X Zhang
GTID: 2178360278975507
Subject: Computer application technology
Abstract/Summary:
In recent years, data mining has been widely applied in many fields, including commercial decision-making, scientific exploration, and medical research. Data mining brings great benefits, but it also inevitably raises the problem of privacy disclosure. As people pay increasing attention to privacy, data mining encounters a series of difficulties in practice. Privacy-preserving data mining has emerged against this background: it protects sensitive data and rules while still producing accurate mining results, and so resolves the conflict between data mining and privacy protection.

This thesis first summarizes current research on privacy-preserving data mining and then studies its most widely used form, privacy-preserving mining of association rules.

First, the MASK algorithm for centralized data is analyzed. MASK achieves privacy-preserving association rule mining through data distortion and distribution reconstruction, but reconstructing the original support of an itemset from the distorted database has exponential complexity, which limits the efficiency of the algorithm. To address this drawback, the thesis proposes an improved algorithm based on the collection principle, which avoids the exponential complexity. Experiments indicate that this method performs better than MASK.

Distributed data mining is a dynamic process in two respects: (1) new institutions may join; (2) over time, a large number of new records are added to the original databases, while some of the original records may be updated or even deleted. The original association rules then become outdated and no longer accurately reflect the hidden rules or patterns in the current database, so they need to be updated.
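To make the distortion-and-reconstruction idea behind MASK concrete, the following is a minimal sketch for the simplest case only, the support of a single item. It assumes the usual description of MASK, in which each 0/1 entry is kept with probability p and flipped with probability 1 - p; the function names `distort` and `reconstruct_support` are illustrative and do not come from the thesis. For a k-itemset the analogous reconstruction has to account for all 2^k bit combinations, which is the source of the exponential cost discussed above; the thesis's improved algorithm is not reproduced here.

```python
import random


def distort(bits, p, rng):
    """Keep each 0/1 value with probability p, flip it with probability 1 - p."""
    return [b if rng.random() < p else 1 - b for b in bits]


def reconstruct_support(distorted_bits, p):
    """Estimate the original fraction of 1s from a distorted column.

    If s is the true support, the expected observed fraction of 1s is
        observed = p * s + (1 - p) * (1 - s),
    which solves to
        s = (observed - (1 - p)) / (2 * p - 1).
    """
    if p == 0.5:
        raise ValueError("p = 0.5 removes all information about the data")
    observed = sum(distorted_bits) / len(distorted_bits)
    return (observed - (1 - p)) / (2 * p - 1)


# Illustrative data: one item column with true support of roughly 0.3.
rng = random.Random(0)
n = 100_000
column = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
actual_support = sum(column) / n

distorted = distort(column, p=0.9, rng=rng)
estimate = reconstruct_support(distorted, p=0.9)
```

The estimate is statistical rather than exact: it approaches the actual support as the number of records grows, and it becomes noisier as p moves toward 0.5.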
The most basic approach is to re-mine the association rules, but this is costly. A new algorithm, PPIUDAR, is proposed for the incremental updating of association rules in a distributed environment. By making use of the existing association rules, it maintains the rules incrementally and efficiently. Because several secure multi-party computation techniques are applied in the algorithm, it guarantees the privacy of each site. Experiments show that the algorithm is correct.
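The abstract does not say which secure multi-party computation primitives PPIUDAR uses. As an illustration of the kind of building block such algorithms commonly rely on, the sketch below shows a ring-based secure sum, which lets sites compute the global support count of an itemset without any site revealing its local count. It is an assumption for illustration, not the thesis's protocol; the name `secure_sum` and the simulation of all sites inside one function are hypothetical. It also assumes semi-honest, non-colluding sites and a true total smaller than `modulus`.

```python
import random


def secure_sum(local_counts, modulus, rng):
    """Simulate a ring-based secure sum over the sites' local counts.

    The initiating site (index 0) adds a random mask to its own count, so
    every later site sees only a value that looks uniformly random.
    """
    mask = rng.randrange(modulus)
    running = (mask + local_counts[0]) % modulus  # initiator sends a masked value
    for count in local_counts[1:]:
        running = (running + count) % modulus  # each site adds its own count
    # The value returns to the initiator, which removes its mask.
    return (running - mask) % modulus


# Hypothetical local support counts of one itemset at three sites.
# A real deployment would draw the mask from a cryptographic RNG.
site_counts = [120, 45, 300]
global_count = secure_sum(site_counts, modulus=2**32, rng=random.Random(1))
```

In an incremental setting, the same primitive could be applied to counts taken only over newly added or deleted records, so that existing global counts are adjusted rather than recomputed from scratch.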
Keywords/Search Tags: data mining, privacy protection, association rules, secure multi-party computation