In recent years, with the continuous development of data mining, clustering algorithms have been widely used in image analysis, pattern recognition, information retrieval, and other fields. As one of the classic algorithms in cluster analysis, k-means clustering requires supporting data during clustering. However, the data may contain sensitive information about users. Thus, while k-means clustering brings convenience to our lives, it also threatens the security of data. Privacy-preserving k-means clustering can classify data into k categories without revealing private data. Existing privacy-preserving k-means clustering schemes suffer from low computational efficiency, ineffective verification, and an inability to protect input and output privacy at the same time. Therefore, the in-depth study of privacy-preserving k-means clustering has important theoretical significance and application value.

(1) Existing cloud-outsourced privacy-preserving k-means clustering schemes suffer from low efficiency and cannot effectively verify the final clustering results. We therefore propose a verifiable, cloud-outsourced privacy-preserving k-means clustering scheme that can be applied to multi-party privacy protection scenarios. First, a new clustering initialization algorithm is proposed, which effectively improves the iterative efficiency of the entire clustering algorithm. Then, multiplication triples are used to assist in computing the secure Euclidean distance. This allows a large amount of computation to be completed in the offline phase, which effectively improves computational efficiency in the online phase. A secure minimum algorithm is designed in combination with garbled circuits. The cloud server can update its shares of the cluster centers separately, which effectively reduces the time complexity. After the data is outsourced, the entire training of the clustering algorithm is performed on the cloud, which effectively reduces the interaction between users and the
cloud. Finally, a verification algorithm is proposed so that users need only one round of communication to verify the clustering results returned by the cloud.

(2) Existing privacy-preserving k-means clustering schemes cannot effectively protect input and output privacy at the same time, and they may add too much noise. We propose a multi-party privacy-preserving k-means clustering scheme based on differential privacy. In the scheme, a multi-party cooperative algorithm for initializing cluster centers is designed, which obtains the initial cluster centers without revealing the privacy of any party. In each iteration, the users and the server need to update the cluster centers interactively. So that no private data is revealed during this interaction, an algorithm for multi-party collaborative updating of cluster centers is proposed. This algorithm uses secret sharing to transmit information without revealing privacy. In each iteration, Laplace noise is added to the clustering result only once. The amount of noise is independent of the number of participants, and the algorithm as a whole improves the accuracy of the clustering results. Finally, the noisy cluster centers obtained by the users do not reveal the privacy of the output results.

Finally, security proofs and simulation experiments are carried out for the two schemes. The experimental results show that the designed schemes achieve high operating efficiency and clustering quality, and can complete k-means clustering without leaking users’ data privacy.
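
As a minimal sketch of the multiplication-triple idea used in scheme (1), the squared Euclidean distance between a secret-shared point and a secret-shared cluster center could be computed as below. The sketch assumes two non-colluding servers, additive secret sharing over a prime field, and a trusted dealer that generates triples in the offline phase; the abstract specifies none of these details, and the modulus `P` and all function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # prime modulus for the share arithmetic (assumed, not from the source)


def share(value):
    """Split an integer into two additive shares modulo P."""
    s0 = secrets.randbelow(P)
    return s0, (value - s0) % P


def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % P


def make_triple():
    """Offline phase: an assumed trusted dealer shares a, b and c = a*b."""
    a = secrets.randbelow(P)
    b = secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)


def beaver_mul(x_sh, y_sh, triple):
    """Online phase: multiply two shared values using one precomputed triple."""
    a_sh, b_sh, c_sh = triple
    # The servers open only the masked values e = x - a and f = y - b.
    e = reconstruct((x_sh[0] - a_sh[0]) % P, (x_sh[1] - a_sh[1]) % P)
    f = reconstruct((y_sh[0] - b_sh[0]) % P, (y_sh[1] - b_sh[1]) % P)
    # x*y = c + e*b + f*a + e*f; the public term e*f is added by server 0 only.
    z0 = (c_sh[0] + e * b_sh[0] + f * a_sh[0] + e * f) % P
    z1 = (c_sh[1] + e * b_sh[1] + f * a_sh[1]) % P
    return z0, z1


def secure_sq_dist(point_sh, center_sh, triples):
    """Return shares of sum_i (x_i - c_i)^2, using one triple per coordinate."""
    acc0, acc1 = 0, 0
    for p_sh, c_sh, triple in zip(point_sh, center_sh, triples):
        # Subtraction is local: each server subtracts its own shares.
        d_sh = ((p_sh[0] - c_sh[0]) % P, (p_sh[1] - c_sh[1]) % P)
        sq0, sq1 = beaver_mul(d_sh, d_sh, triple)
        acc0, acc1 = (acc0 + sq0) % P, (acc1 + sq1) % P
    return acc0, acc1


point = [share(v) for v in [3, 7, 1]]
center = [share(v) for v in [1, 2, 5]]
triples = [make_triple() for _ in point]
print(reconstruct(*secure_sq_dist(point, center, triples)))  # prints 45
```

Only the triple generation depends on randomness unrelated to the data, which is why it can be moved offline; the online phase consists of local additions plus the opening of two masked values per multiplication.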