Research On Clustering Algorithm Based On Differential Privacy Protection

Posted on:2019-12-16

Degree:Master

Type:Thesis

Country:China

Candidate:C Li

Full Text:PDF

GTID:2428330578972832

Subject:Software engineering

Abstract/Summary:

Big data has important applications in all walks of life. In an era where data is king, mastering data is the key to winning, and enterprises of all kinds pay more and more attention to what data can do. Everyday data that seems inconspicuous can, once analyzed through data mining, reveal important and valuable information that helps guide forecasting and the next steps of operation and management. This is the value of data mining. However, all such data is ultimately data about people, so personal information must be protected while the latent associations in the data are mined. How to guarantee data privacy and prevent the leakage of personal information is therefore an important problem, and protecting user data during the mining process has become a major research direction in privacy protection. Among the many privacy protection methods, differential privacy stands out because it rests on a rigorous mathematical foundation and allows the level of privacy to be quantified. Combining it with data mining can effectively ensure that mining the data does not disclose private information. This thesis studies data mining algorithms based on differential privacy protection in the following three aspects:

(1) An R-neighborhood distance outlier algorithm is proposed. Outliers are detected by a distance ratio, and the data set is then divided into several parts, which helps the subsequent DP K-means algorithm select its initial center points. Experiments show that the algorithm detects outliers effectively while offering a substantial running-time advantage, and that it is well suited for use with clustering algorithms.

(2) An improved DP K-means algorithm with outlier elimination (DP-ODK-means) is proposed. K-means under differential privacy must strengthen data privacy while keeping the clustering results usable. The algorithm reduces the randomness of initial cluster center selection: using the improved distance-based outlier detection method, the initial cluster centers are chosen from the subsets produced by the density division. This improves clustering efficiency, and Laplace noise is added to the original data to protect it. Experiments show that the method satisfies the differential privacy requirement while preserving the availability of the clustering results.

(3) A DP-MCDBScan algorithm based on differential privacy is proposed. The DP-DBScan method, which incorporates differential privacy, can effectively address information security in the clustering process and can handle data sets that contain a certain amount of noise. DP-MCDBScan improves on DP-DBScan: by optimizing how core points are selected, it improves clustering accuracy when the privacy budget is low, reduces time cost, and lessens the effect of the initial random selection on the clustering result.
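
The abstract does not spell out how Laplace noise enters the K-means computation in item (2). As a rough illustration only, the sketch below shows one update step of the common textbook formulation of differentially private K-means, in which noise is added to per-cluster counts and coordinate sums. It is not the thesis's DP-ODK-means algorithm, and it omits the outlier elimination and initial-center selection entirely. The function names, the assumption that every coordinate lies in [0, 1], and the rule of keeping the old center when the noisy count is too small are all hypothetical choices made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_kmeans_step(points, centers, epsilon, rng):
    """One noisy K-means update (illustrative, not the thesis's method).

    Assumption: every coordinate of every point lies in [0, 1]. Adding or
    removing one point then changes one cluster's count by at most 1 and its
    coordinate sums by at most 1 each, so the L1 sensitivity is dim + 1.
    """
    dim = len(centers[0])
    scale = (dim + 1) / epsilon  # Laplace scale = sensitivity / epsilon
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)

    # Assign each point to its nearest current center.
    for p in points:
        nearest = min(
            range(len(centers)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
        )
        counts[nearest] += 1
        for t in range(dim):
            sums[nearest][t] += p[t]

    # Recompute each center from noisy sums and noisy counts.
    new_centers = []
    for j, old in enumerate(centers):
        noisy_count = counts[j] + laplace_noise(scale, rng)
        if noisy_count < 1.0:
            # Hypothetical fallback: keep the previous center.
            new_centers.append(list(old))
            continue
        center = [
            min(1.0, max(0.0, (sums[j][t] + laplace_noise(scale, rng)) / noisy_count))
            for t in range(dim)
        ]
        new_centers.append(center)
    return new_centers


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.random() * 0.2, rng.random() * 0.2] for _ in range(100)]
    data += [[0.8 + rng.random() * 0.2, 0.8 + rng.random() * 0.2] for _ in range(100)]
    print(dp_kmeans_step(data, [[0.3, 0.3], [0.7, 0.7]], epsilon=1.0, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and so stronger privacy at the cost of less accurate centers. If the step is repeated, the total privacy budget has to be divided among the iterations.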

Keywords/Search Tags:

Differential privacy protection, Data mining, DP-MCDBScan, Outlier elimination, Availability

Related items

1	Research On Outlier Detection Algorithm Based On Differential Privacy Protection Model
2	Differential Privacy Protection And Implementation Of Frequent Subgraph Mining
3	Real-time Data Privacy Protection With Adaptive ω-event Differential Privacy
4	Research On Enhanced Differential Privacy Protection Technology For User Sensitive Data
5	Differential Privacy Based Data Privacy Protection And Its Application
6	Research On Frequency Estimation And Frequent Itemset Mining For Local Differential Privacy Protection
7	Research On Privacy Protection Algorithm For Association Rules Mining
8	Research And Application On Privacy Protection For Data Mining
9	Research On Data Privacy Protection Method Based On Differential Privacy Mechanism
10	Research On Data Publishing And Mining Method Based On Differential Privacy