Font Size: a A A

Outsourcing Computation Of Privacy Preserving K-means Clustering Algorithm Based On Secure Multiparty Computation

Posted on:2016-09-03Degree:MasterType:Thesis
Country:ChinaCandidate:X Y LiuFull Text:PDF
GTID:2308330503451187Subject:Computer technology
Abstract/Summary:PDF Full Text Request
K-means algorithm is one of the typical data mining clustering algorithms. With the development of economic globalization and the deepening of cooperation, clustering algorithm data source has been becoming more and more diverse. Considering the K-means clustering of data points will have come from multiple data owners. In order to ensure that the privacy of multiple data owners have did not leak in the calculation process, we have taken measures to protect data from the owner’s privacy data. At the same time, the privacy protection of data will have brought a large amount of computation to the data owner. It has needs to be outsourced to the server to reduce the amount of the data owner.Considering the need of two aspects, this paper has combined K-means clustering algorithm and outsourced computing, and achieved privacy through encryption and achieve ciphertext computing through secure multiparty computation. Multiple data owners have encrypted the data and uploaded to outsource server. Then, server has calculated on the ciphertext and return clustering results to data owner. Most of the computation has been delivered to server and the data owners have just did small computation. In the realization of the clustering effect, it ensures the clustering process data owners’ private data that will not be compromised. This issue mainly has two difficulty: one is to achieve privacy protection K-means clustering algorithm computing outsourcing; the other is the calculation K-means algorithm ciphertext under the two-party coalition as well as multi-data set.Outsourcing of privacy preserving K-means clustering algorithm calculates two main technical difficulties which homomorphic encryption did not solve: comparing the ciphertext data and ciphertext division. For two technical difficulties, we have designed two different protocols secure multiparty computation, one has been completed by threshold encryption ciphertext condition to solve the comparison encrypted data problem. The other has mapped the ciphertext division to privacy preserving weight average problem(PPWAP). Two kinds of ciphertext calculate have corresponded two kinds homomorphic encryption algorithm, namely improved Liu homomorphic encryption has been used to encrypt the threshold and Paillier has been used for PPWAP agreement.In resolving the two-party coalition datasets on K-means clustering algorithm problem, this paper has taken the two party as an example, based on the data mining of traditional privacy preserving data mining. Data distribution has been divided into horizontal data distribution, vertical data distribution and arbitrary data distribution. For each data distribution, different algorithms have been designed for outsourcing of privacy preserving K-means clustering algorithm. In addition this paper have expanded the data distribution of the two parties to the tripartite K-means clustering computing.In this paper, the efficiency of the data owner and the outsourcing server has been considered from the aspects of time complexity and space complexity. The security of privacy preserving K-means clustering algorithm has been implemented in the semi honest model, and has been able to resist a certain degree of collusion attack. This thesis has analyzed and verified the feasibility of the scheme through experiments.
Keywords/Search Tags:outsourcing computing, privacy preserving data mining, K-means clustering algorithm, security multiparty computing, homomorphic encryption
PDF Full Text Request
Related items