Research On K-means++ Clustering Algorithm Based On Laplace Mechanism For Differential Privacy Protection

Posted on: 2019-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z D Li
GTID: 2428330572966432
Subject: Computer technology
Abstract/Summary:
With the rapid development and application of networked information technology, more and more Internet products, including online shopping, dating, medical services, entertainment video, large website platforms, and small mobile apps, have become part of daily life. Almost all of them are built on personal data covering many aspects of people's lives, and they depend on that data to survive. The data is used for data mining and analysis, but if it is handled improperly during analysis and mining, user privacy may leak, which threatens users' information security. How to protect privacy during data mining is therefore a hot topic in the field.

Traditional general-purpose privacy protection models such as k-anonymity are based on grouping records. Their drawback is that an attacker with enough background knowledge can recover a user's real private data through analysis. In 2006, Dwork et al. proposed differential privacy, a strict and provable privacy protection model that defines a very strong attack model: even if the attacker already knows all the data except the target record, the differential privacy mechanism still protects the target record from being disclosed. In addition, the amount of noise added by differential privacy is independent of the size of the data set, which is very beneficial for large-scale data mining and analysis.

The accuracy of the traditional differentially private K-means clustering algorithm is strongly affected by the selection of the initial center points. This thesis uses the K-means++ clustering algorithm to optimize the initial center points. The resulting differentially private algorithm, DPK-means++, addresses the privacy problem that arises when the initial center points are selected at random. DPK-means++ can provide different levels of data privacy protection, controlled by the privacy budget parameter, while maintaining clustering accuracy.

Spectral clustering is a clustering technique based on graph theory. This thesis applies the DPK-means++ clustering algorithm to spectral clustering and proposes a spectral clustering algorithm based on DPK-means++. The proposed algorithm provides privacy protection and better accuracy when clustering non-convex data, and achieves a good balance between the two algorithms.
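
The abstract does not give the details of DPK-means++, so the following is only a minimal Python sketch of the general ideas it names: Laplace noise, K-means++ seeding, and a noisy center update. It is not the thesis's algorithm. The function names (`laplace_noise`, `kmeanspp_seeds`, `noisy_update`) are hypothetical, the data is assumed to be scaled so every coordinate lies in [0, 1], and the seeding step shown here is the ordinary non-private K-means++ rule.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, rng):
    """Ordinary (non-private) K-means++ seeding: sample each new center
    with probability proportional to its squared distance to the
    nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def noisy_update(points, centers, epsilon, rng):
    """One K-means update step with Laplace noise on cluster sums and counts.

    Assumption: every coordinate is in [0, 1], so adding or removing one
    point changes the (sum, count) statistics by at most dim + 1 in L1 norm.
    """
    dim = len(points[0])
    scale = (dim + 1) / epsilon  # Laplace scale = sensitivity / epsilon
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p in points:
        j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]

    new_centers = []
    for j, old in enumerate(centers):
        noisy_count = counts[j] + laplace_noise(scale, rng)
        if noisy_count < 1.0:
            # Noisy count is unusable; keep the previous center.
            new_centers.append(list(old))
            continue
        center = []
        for d in range(dim):
            value = (sums[j][d] + laplace_noise(scale, rng)) / noisy_count
            center.append(min(1.0, max(0.0, value)))  # clamp to the data range
        new_centers.append(center)
    return new_centers
```

In this sketch each call to `noisy_update` spends `epsilon` of the privacy budget, so running T iterations costs T times `epsilon` under sequential composition. A smaller `epsilon` gives a larger noise scale and stronger protection at the cost of less accurate centers, which is the trade-off the abstract refers to.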
Keywords/Search Tags: Privacy protection, Differential privacy, Data mining, Clustering, Spectral clustering