
A Study Of Privacy-Preserving Data Mining Based On Multiplicative Perturbation

Posted on: 2013-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Shi
Full Text: PDF
GTID: 2248330371981367
Subject: Management Science and Engineering
Abstract/Summary:
Data mining has become a main tool for management and business intelligence analysis. As its applications spread, however, data leakage occurs frequently and users' privacy cannot be guaranteed. Privacy-preserving data mining has therefore become a major research focus: how to protect users' private data while still obtaining high-quality mining results.
Different data mining algorithms call for different privacy-preserving techniques. Decision tree classification builds its model from the distribution of the data, so additive perturbation is highly effective. The Apriori association rule algorithm builds its model from the occurrence probabilities of the items, so randomized response is very useful. A further class of algorithms, such as K-means clustering and support vector machine classification, needs only the distances and inner products between data points to build a model. For these algorithms multiplicative perturbation is sufficient, and it is the focus of this thesis.
The main privacy-preserving data mining algorithms based on multiplicative perturbation are rotation perturbation (RP) and projection perturbation (PP). RP rotates the data by a single angle, and PP projects the data from a high-dimensional space to a low-dimensional space. Independent component analysis (ICA), an effective tool for separating source signals from mixed signals, can also be used to estimate the user data accurately from rotation-perturbed data, which greatly reduces the privacy security of RP. This thesis proposes known-knowledge ICA (KK-ICA), which can accurately estimate the user data from projection-perturbed data and therefore greatly reduces the privacy security of PP as well.
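The claim that distance-based and inner-product-based algorithms can work on multiplicatively perturbed data can be checked numerically. The following is a minimal sketch, not the thesis's own implementation: the helper names (`random_orthogonal`, `pairwise_distances`), the synthetic Gaussian data, and the Gaussian projection matrix are all assumptions made for illustration. It shows that a random orthogonal rotation leaves pairwise distances and inner products unchanged up to rounding, whereas a projection to fewer dimensions preserves them only approximately.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(d, rng):
    """Draw a d x d orthogonal matrix via QR of a Gaussian matrix (illustrative helper)."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    # Flip column signs so the result does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))

def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Synthetic stand-in for user data: 50 records with 8 attributes (assumption).
X = rng.standard_normal((50, 8))

# Rotation perturbation: multiply every record by one orthogonal matrix.
R = random_orthogonal(8, rng)
X_rp = X @ R.T

# Projection perturbation: map 8 attributes down to k = 4 with a random matrix.
k = 4
P = rng.standard_normal((8, k)) / np.sqrt(k)
X_pp = X @ P

rp_dist_gap = np.abs(pairwise_distances(X) - pairwise_distances(X_rp)).max()
rp_inner_gap = np.abs(X @ X.T - X_rp @ X_rp.T).max()
pp_dist_gap = np.abs(pairwise_distances(X) - pairwise_distances(X_pp)).max()

print("RP max distance change:     ", rp_dist_gap)
print("RP max inner-product change:", rp_inner_gap)
print("PP max distance change:     ", pp_dist_gap)
```

Because K-means and kernel-based classifiers consume only these distances and inner products, a model trained on `X_rp` should behave like one trained on `X`, which is why the privacy of the rotation itself becomes the deciding question.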
This thesis presents a new privacy-preserving algorithm based on multiplicative perturbation, partial rotation perturbation (PRP). The data is divided into several independent parts, each of which is rotated by a random orthogonal matrix, so that attackers cannot accurately estimate the user data. This makes PRP more secure than RP while keeping the same accuracy.
In the experiments, we introduce relative error (RE) and Frobenius relative error (F-RE) as measures and compare the security and accuracy of PRP, PP and RP, verifying the advantages of PRP. PRP is designed for data mining algorithms based on distance and inner product, so in the final part of this thesis we apply it to clustering and classification models and compare the results with those obtained by using the user data directly. This shows more intuitively how partial rotation perturbation can be used in practical data mining applications.
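The abstract does not say how the data is divided or how RE and F-RE are defined, so the sketch below rests on two loudly stated assumptions: the attributes (columns) are split into disjoint groups that are each rotated by their own random orthogonal matrix, and F-RE is taken to be the Frobenius norm of the difference divided by the Frobenius norm of the original. The function names `partial_rotation` and `frobenius_relative_error` are hypothetical. Under these assumptions the overall transform is block-orthogonal, which is consistent with the stated claim that PRP keeps the same accuracy as RP.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Draw a d x d orthogonal matrix via QR of a Gaussian matrix (illustrative helper)."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))

def partial_rotation(x, n_parts, rng):
    """Hypothetical PRP sketch: rotate disjoint column groups independently.

    Assumption: "parts" means groups of attributes; the thesis may divide
    the data differently.
    """
    groups = np.array_split(rng.permutation(x.shape[1]), n_parts)
    y = np.empty_like(x)
    for cols in groups:
        # Each group gets its own random orthogonal matrix.
        y[:, cols] = x[:, cols] @ random_orthogonal(len(cols), rng).T
    return y

def frobenius_relative_error(a, b):
    """Assumed F-RE definition: ||a - b||_F / ||a||_F."""
    return np.linalg.norm(a - b) / np.linalg.norm(a)

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 6))          # synthetic stand-in for user data
Y = partial_rotation(X, n_parts=3, rng=rng)

# Compare the inner-product (Gram) matrices before and after perturbation.
gram_error = frobenius_relative_error(X @ X.T, Y @ Y.T)
print("F-RE of inner-product matrix:", gram_error)
```

The perturbed records `Y` differ from `X`, yet the inner-product matrix is reproduced up to rounding, so distance-based clustering or classification on `Y` should match the results on `X` under these assumptions.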
Keywords/Search Tags: Privacy-preserving data mining, Multiplicative perturbation, Partial rotation perturbation, Rotation perturbation, Projection perturbation