Research On Clustering Algorithms In Differential Privacy

Posted on: 2020-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Hu
Full Text: PDF
GTID: 2428330590995552
Subject: Computer application technology
Abstract/Summary:
The rapid development of the Internet has led to massive amounts of data generated by online economic activities such as e-commerce, instant messaging, and online services. Data mining emerged so that companies and other organizations could extract the valuable information hidden in these data and apply the results to business strategy, decision analysis, and similar tasks. While people enjoy the convenience of data mining, the disclosure of large amounts of sensitive information exposes users to threats and losses, and data privacy leakage is becoming increasingly serious. How to protect data privacy during the data mining process has therefore become a hot issue in both data mining and privacy protection. Differential privacy, with its rigorous mathematical definition and provable guarantees, has attracted much attention and has been widely studied in recent years. Existing work on differentially private data mining concentrates mostly on association rules and classification algorithms, while research on differentially private clustering is relatively rare. This thesis focuses on clustering under differential privacy and covers three main aspects.

First, existing differentially private k-means clustering algorithms are sensitive to the selection of the initial center points, which reduces the utility of the clustering results. A new optimized differentially private clustering algorithm, DPk-means-up, is proposed. The algorithm selects appropriate initial center points, which reduces the number of iterations and improves the utility of the clustering results. Theoretical analysis shows that the algorithm satisfies ε-differential privacy and can be applied to data sets of different sizes and dimensions. Comparative experiments show that, under the same level of privacy protection, the proposed algorithm improves the utility and performance of the clustering results compared with other differentially private k-means clustering methods.

Second, existing differentially private spectral clustering algorithms have two problems: the choice of the scale parameter strongly affects the results, and the number of clusters must be specified in advance. A new optimized differentially private adaptive spectral clustering algorithm is proposed. The algorithm selects the value of k that maximizes the eigengap as the most suitable number of clusters, and it automatically calculates the scale parameter so that the affinity between samples is better reflected. In addition, it replaces the k-means algorithm used in traditional spectral clustering with the DPk-means-up algorithm proposed above. Theoretical analysis and experimental results show that the optimized algorithm considerably improves the accuracy and utility of the clustering results compared with the traditional differentially private spectral clustering algorithm.

Third, to verify the effectiveness of the DPk-means-up algorithm in practical applications, this thesis selects the group recommender system as the application scenario and introduces DPk-means-up into the group recommendation algorithm so that user privacy is not leaked during the group recommendation process. The experimental results demonstrate that applying the DPk-means-up algorithm to group recommendation achieves a better balance between the level of privacy protection and the accuracy of the recommendation results.
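
The abstract does not give the details of DPk-means-up, so the following is only a minimal sketch of the generic differentially private k-means baseline that such work builds on. It is not the thesis's algorithm. It assumes that every point lies in [0, 1]^d, that the initial centers are drawn independently of the data, that neighboring data sets differ by adding or removing one record, and that the privacy budget ε is split evenly across iterations. The function names `laplace`, `dp_kmeans_step`, and `dp_kmeans` are hypothetical.

```python
import math
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponential draws is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_kmeans_step(points, centers, epsilon, rng):
    """One noisy Lloyd iteration; points are assumed to lie in [0, 1]^d."""
    d = len(points[0])
    k = len(centers)
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    # Assign each point to its nearest current center.
    for p in points:
        j = min(range(k), key=lambda c: math.dist(p, centers[c]))
        counts[j] += 1
        for t in range(d):
            sums[j][t] += p[t]
    # One record changes a single cluster's count by at most 1 and each of its
    # d coordinate sums by at most 1, so the L1 sensitivity is d + 1.
    scale = (d + 1) / epsilon
    new_centers = []
    for j in range(k):
        noisy_count = counts[j] + laplace(scale, rng)
        if noisy_count < 1.0:
            # Too little (noisy) mass to form a mean: keep the old center.
            new_centers.append(list(centers[j]))
            continue
        center = [(sums[j][t] + laplace(scale, rng)) / noisy_count
                  for t in range(d)]
        # Clip back into the assumed data domain.
        new_centers.append([min(1.0, max(0.0, v)) for v in center])
    return new_centers


def dp_kmeans(points, k, epsilon, iterations=5, seed=0):
    rng = random.Random(seed)
    d = len(points[0])
    # Data-independent random initialization (an assumption of this sketch).
    centers = [[rng.random() for _ in range(d)] for _ in range(k)]
    for _ in range(iterations):
        # Sequential composition: each iteration spends epsilon / iterations.
        centers = dp_kmeans_step(points, centers, epsilon / iterations, rng)
    return centers
```

Calling `dp_kmeans(points, k=2, epsilon=1.0)` on a list of coordinate lists returns two noisy centers. Because the budget is divided by the number of iterations, fewer iterations mean less noise per step, which is presumably why a better choice of initial centers, as the abstract describes, can improve utility.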
Keywords/Search Tags: privacy preserving, differential privacy, data mining, k-means, spectral clustering, clustering algorithm, group recommender