
Research On Key Technologies Of Privacy-preserving Data Mining On The Cloud

Posted on: 2021-10-07; Degree: Doctor; Type: Dissertation
Country: China; Candidate: L Liu; Full Text: PDF
GTID: 1488306548491294; Subject: Computer Science and Technology
Abstract/Summary:
Although the development and popularization of cloud computing have brought convenience to production and daily life, they also raise data security and privacy issues in the cloud. On the one hand, the data collected or outsourced by data owners may contain sensitive information, and sending it directly to the cloud may violate the owners' privacy. On the other hand, the protection measures that the cloud provides are inadequate, so external attacks and internal leaks occur from time to time. Data perturbation and anonymization techniques are commonly used to protect the data. However, these techniques not only provide a relatively low security level but also make the data mining results less accurate. A conventional cryptosystem can provide semantic security or an even higher security level, but it makes the data unusable for data mining. Therefore, how to efficiently provide a privacy-preserving data mining service to clients in the cloud environment remains a major challenge for both academia and industry. In this dissertation, we first analyzed in depth the security and privacy problems that arise when designing a privacy-preserving data mining scheme in the cloud environment. We then designed privacy-preserving schemes for decision tree training and evaluation, kNN classification, and association rule mining that balance security, usability, and efficiency.

1. To deal with data security and privacy issues in the cloud when building decision tree models, we proposed three privacy-preserving decision tree training schemes with different privacy levels in the twin-cloud model, all of which work on datasets whose transactions are encrypted under different public keys. In these schemes, the data owners can go offline after uploading their data to the cloud. Compared with existing work, our schemes incur lower computation and communication costs. In addition, because the cloud platform cannot split an encrypted dataset, we proposed a decision tree training scheme that uses multi-vector counting instead of dataset splitting. Security analysis shows that our schemes resist background attacks and are secure under the semi-honest model. Experiments on a real-world dataset show that the proposed schemes are efficient.

2. At present, most privacy-preserving decision tree evaluation schemes cannot fully protect the decision tree model. Moreover, these schemes place large computation and communication burdens on service users. To solve these problems, we proposed two privacy-preserving decision tree evaluation schemes in the twin-cloud model, namely PPDE-DTPKC and PPDE-PSS. Specifically, the PPDE-DTPKC scheme can handle service queries encrypted under different public keys. Moreover, both PPDE-DTPKC and PPDE-PSS allow users to be offline during the evaluation. Security analysis shows that in PPDE-DTPKC and PPDE-PSS nothing about the tree model is leaked to the service users, while the clouds learn nothing about the query data or the query results. In addition, experiments on real-world datasets show that PPDE-PSS performs better than the most closely related work when dealing with deep but sparse trees.

3. Among existing work on privacy-preserving k-nearest-neighbor classification, some schemes achieve higher efficiency at the cost of lower security, while others achieve higher security at the cost of lower performance. To improve computational performance as much as possible while ensuring high security, we designed a secure k-nearest-neighbor classification scheme in the twin-cloud model using the Paillier cryptosystem and additive secret sharing. In this scheme, neither the data owners nor the service users need to take part in any computation other than uploading data and receiving classification results. Through security analysis, we prove that the proposed scheme is secure under the semi-honest model and can also resist external attacks. Moreover, our scheme protects the data access pattern. Experiments on real datasets show that our scheme has significantly lower computation and communication costs than existing work with similar security levels.

4. To address the data privacy and security problems in association rule mining on outsourced data, we proposed a privacy-preserving scheme for mining and querying frequent items and association rules. With this scheme, the cloud can run a secure Apriori algorithm without the users' assistance to mine all frequent items and association rules in the outsourced data. In addition, the cloud platform can provide two types of query service for frequent items and association rules: one with user-defined thresholds and one with cloud-defined thresholds. The proposed scheme supports secure mining and query services on datasets encrypted by different users under different public keys. Security analysis and experiments on real datasets verify the security and efficiency of the scheme.
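
As an illustration of the additive secret sharing named in item 3, the following is a minimal Python sketch of how a value can be split between two non-colluding clouds and how the clouds can add shared values without seeing them. It is not the dissertation's protocol: the ring size `MODULUS` and the function names `share`, `add_shares`, and `reconstruct` are assumptions made only for this example, and the Paillier encryption and any comparison or multiplication sub-protocols a full scheme would need are omitted.

```python
import secrets

# Hypothetical ring size chosen for illustration; the dissertation's
# actual parameters are not stated in the abstract.
MODULUS = 2**64


def share(value, modulus=MODULUS):
    """Split a value into two additive shares, one for each cloud."""
    share_a = secrets.randbelow(modulus)   # uniformly random share for cloud A
    share_b = (value - share_a) % modulus  # complementary share for cloud B
    return share_a, share_b


def add_shares(x_share, y_share, modulus=MODULUS):
    """Run locally by one cloud: add its shares of two secret values."""
    return (x_share + y_share) % modulus


def reconstruct(share_a, share_b, modulus=MODULUS):
    """Combine both shares to recover the secret value."""
    return (share_a + share_b) % modulus


# A data owner shares two values and can then go offline.
x_a, x_b = share(25)
y_a, y_b = share(17)

# Each cloud adds the shares it holds; neither cloud sees 25, 17, or the sum.
sum_a = add_shares(x_a, y_a)  # computed by cloud A
sum_b = add_shares(x_b, y_b)  # computed by cloud B

# Only a party holding both result shares can recover the sum.
result = reconstruct(sum_a, sum_b)
print(result)  # 42
```

Each share on its own is uniformly distributed over the ring, so a single cloud learns nothing about the underlying value. This is why such schemes assume that the two clouds do not collude.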
Keywords/Search Tags:Outsourced Data Security, Privacy-Preserving Data Mining, Homomorphic Encryption, Secure Multiparty Computation, Secret Sharing