| The kNN classification algorithm is one of the important algorithms in machine learning. It predicts labels by analyzing existing data and plays an important role in disease pre-diagnosis, stock trend prediction, precision marketing for online sales, recommendation services, and other fields. Because kNN classification is so widely used, the risk of data privacy leakage in the algorithm itself has attracted extensive attention from industry and academia. With the advent of the "big data" era, data volumes have grown significantly, and local client resources can no longer meet the resulting storage and computing needs. Cloud computing resolves this dilemma: it offers flexibility, large storage capacity, and powerful computing capability, and a local client can access data stored in the cloud at any time and outsource complex computation to the cloud server, so cloud computing is increasingly popular in practical applications. However, when a client with limited storage and computing resources outsources data and computing tasks to an untrusted cloud server, the cloud server also gains access to the data, and the cloud service provider may be tempted to sell the stored data to third parties, which makes data privacy breaches an even more prominent issue. Studying kNN classification algorithms that support data privacy preservation has therefore become an inevitable trend. The data privacy problems of kNN classification and of untrusted cloud computing services can be addressed with secure multi-party computation, which lets multiple participants compute jointly without revealing their private information. Academia has already proposed many privacy-preserving kNN classification schemes, but challenges remain, such as classification efficiency, classification accuracy, whether weak clients are supported, and whether
clients can remain offline during the classification process. These challenges are the direction explored in this paper. This paper implements privacy-preserving kNN classification schemes first in a multi-party computing environment and then in a two-party computing environment. The main work is as follows: (1) A cloud-assisted privacy-preserving kNN classification scheme in a multi-party computing environment is proposed. To accommodate weak clients with limited storage and computing resources and untrusted cloud services, the scheme splits the data using secret sharing, outsources the resulting secret shares to two non-colluding cloud servers, and implements a secure squared Euclidean distance computation protocol on top of a secure two-party multiplication protocol. In addition to the two non-colluding cloud servers, a third-party auxiliary cloud server is introduced to extend the secure minimum protocol into a secure Top-k selection protocol that finds the k smallest values. Based on the secure squared Euclidean distance computation protocol and the secure Top-k selection protocol, a cloud-assisted privacy-preserving kNN classification scheme is proposed. The experimental results show that the scheme achieves data privacy preservation and an efficient classification process while guaranteeing classification accuracy. (2) A privacy-preserving kNN classification scheme based on K-means clustering in a two-party computing environment is proposed. In the cloud-assisted privacy-preserving kNN classification scheme, the basic protocols are executed directly on the original dataset, so the number of samples to be processed is large, which incurs heavy computation and communication overhead and reduces classification efficiency; moreover, three non-colluding cloud servers are difficult to find in practical applications. To address these issues, this paper proposes to run K-means clustering on the dataset before the scheme starts, secret-share the cluster centers
and the clusters after clustering, outsource the resulting secret shares to the two cloud servers, and then execute the scheme on the cluster centers and the specific secret shares of samples within clusters, which reduces the number of samples to be processed. The secure Top-k selection protocol is also improved, yielding a Top-k selection protocol suited to the two-party computing environment. Experimental results show that the privacy-preserving kNN classification scheme based on K-means clustering in a two-party computing environment performs substantially better. The experimental validation in this paper uses the Iris, Wine, Car Evaluation, Nursery, and Waveform datasets. The results show that the cloud-assisted privacy-preserving kNN classification scheme preserves data privacy while its accuracy remains almost the same as that of plaintext kNN classification. The accuracy loss of the secure two-party kNN classification scheme based on K-means clustering is only in the range of 0.003 to 0.0212 compared with plaintext kNN, while its time overhead is 71.53% lower than that of the cloud-assisted privacy-preserving kNN classification scheme, a significant efficiency gain that yields an efficient privacy-preserving kNN classification algorithm. In addition, both schemes proposed in this paper allow the data owner to go offline once the secret shares have been distributed. |
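
The secret-sharing and secure squared Euclidean distance steps described in the abstract can be illustrated with a minimal sketch. It is a single-process simulation under stated assumptions, not the paper's protocol: the modulus `P`, all function names, and the use of Beaver multiplication triples for the two-party multiplication are hypothetical choices made for illustration, and features are assumed to be integers (for example, fixed-point encoded values).

```python
import secrets

# Assumed prime modulus for additive sharing; the paper does not specify one.
P = 2**61 - 1


def share(x):
    """Split integer x into two additive shares modulo P (one per server)."""
    s0 = secrets.randbelow(P)
    s1 = (x - s0) % P
    return s0, s1


def reconstruct(s0, s1):
    """Recombine two additive shares into the original value modulo P."""
    return (s0 + s1) % P


def beaver_triple():
    """Return shares of random a, b and c = a*b.

    Assumption: in a real deployment these come from a trusted dealer or an
    offline phase; here they are generated in the clear for simulation only.
    """
    a = secrets.randbelow(P)
    b = secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)


def secure_mul(x_sh, y_sh):
    """Multiply two shared values using a Beaver triple (simulated)."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
    # Each server publishes its masked share; only d = x - a and e = y - b
    # are opened, and they reveal nothing about x or y.
    d = (x_sh[0] - a0 + x_sh[1] - a1) % P
    e = (y_sh[0] - b0 + y_sh[1] - b1) % P
    # Local computation per server; the public d*e term is added by server 0.
    z0 = (c0 + d * b0 + e * a0 + d * e) % P
    z1 = (c1 + d * b1 + e * a1) % P
    return z0, z1


def secure_sq_dist(x_shares, q_shares):
    """Return shares of the squared Euclidean distance between x and q."""
    acc0 = acc1 = 0
    for (x0, x1), (q0, q1) in zip(x_shares, q_shares):
        # Subtraction is local: each server subtracts its own shares.
        diff = ((x0 - q0) % P, (x1 - q1) % P)
        z0, z1 = secure_mul(diff, diff)
        acc0 = (acc0 + z0) % P
        acc1 = (acc1 + z1) % P
    return acc0, acc1


# Usage: the data owner shares a sample, the querier shares a query point.
sample_shares = [share(v) for v in [5, 1, 4]]
query_shares = [share(v) for v in [2, 3, 4]]
dist_shares = secure_sq_dist(sample_shares, query_shares)
```

Neither server's share of the distance reveals the distance on its own; in a full scheme the shares would feed a secure Top-k selection step rather than being reconstructed.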