
Research On Data Perturbation Technology Based On Local Differential Privacy

Posted on: 2022-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2518306614459824
Subject: Computer Software and Application of Computer
Abstract/Summary:
In recent years, sustained research and development in machine learning has led to great success in fields such as speech recognition, image processing, and personalized recommendation. To provide more accurate and diverse services, machine learning techniques require the collection of, and training on, huge amounts of data. However, the user data used for training usually contains sensitive personal information, and using it directly for model training risks leaking user privacy. An effective privacy protection scheme is therefore urgently needed. As an emerging privacy protection method, local differential privacy differs from traditional differential privacy in that it assumes third parties are untrustworthy and performs the data perturbation on the data owner's side, which provides more stringent and reliable privacy protection. It is also computationally cheaper than encryption algorithms, and thus better suited to scenarios where large amounts of data are collected and analyzed. Nevertheless, data perturbed under a local differential privacy scheme has lower utility; for example, in scenarios such as deep learning, where many values must be perturbed, the resulting model accuracy is low.

To address this loss of accuracy, the thesis proposes a numeric-transform-to-category (NTC) perturbation mechanism that satisfies differential privacy for continuous numerical data. Unlike existing methods, which perturb the data directly with a perturbation method matched to its type, the proposed mechanism first converts the data type: it transforms the numerical data into one-dimensional binary categorical data and perturbs the categorical data. After the perturbation, the categorical data is transformed back into numerical data. Experimental results on real and synthetic datasets show that the proposed mechanism yields lower error than existing methods on both mean estimation and empirical risk minimization tasks, and therefore further improves data utility while protecting user privacy.

To address the poor usability of local differential privacy mechanisms in scenarios with many perturbed parameters, such as deep learning, the thesis combines the advantages of encryption algorithms and perturbation mechanisms and proposes a hybrid mechanism of homomorphic encryption and local differential privacy. To better fit users' privacy requirements in practice, the hybrid mechanism applies the NTC mechanism to the model parameters and, according to the classification of users, additionally encrypts part of the data as required. Experiments and comparisons are conducted on three image datasets using federated learning, and the correctness and usability of the proposed privacy protection scheme are further verified by comparing the image classification results of different privacy protection schemes.
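The numeric-to-category idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's NTC mechanism, whose details the abstract does not give. It assumes inputs scaled to [-1, 1], uses stochastic rounding to a single bit as the numeric-to-category step, applies standard binary randomized response to that bit, and maps the result back to a number with a debiasing scale. The function names `perturb_numeric` and `estimate_mean` are hypothetical.

```python
import math
import random


def perturb_numeric(x, epsilon, rng=random):
    """Return an epsilon-LDP report for one value x in [-1, 1] (illustrative sketch)."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")

    # Step 1 (assumed transform): numeric -> binary category.
    # The bit is 1 with probability (1 + x) / 2, so E[2 * bit - 1] = x.
    bit = 1 if rng.random() < (1.0 + x) / 2.0 else 0

    # Step 2: randomized response on the bit.
    # Keep it with probability e^eps / (e^eps + 1), otherwise flip it.
    exp_eps = math.exp(epsilon)
    if rng.random() >= exp_eps / (exp_eps + 1.0):
        bit = 1 - bit

    # Step 3: binary category -> numeric.
    # The scale undoes the bias introduced by the random flips.
    scale = (exp_eps + 1.0) / (exp_eps - 1.0)
    return scale if bit == 1 else -scale


def estimate_mean(values, epsilon, rng=random):
    """Average the perturbed reports; an unbiased estimate of the true mean."""
    reports = [perturb_numeric(v, epsilon, rng) for v in values]
    return sum(reports) / len(reports)
```

Each report is one of only two values, so a single report reveals little about `x`, while the average over many users converges to the true mean. Passing a seeded `random.Random` instance as `rng` makes a run reproducible.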
Keywords/Search Tags: local differential privacy, federated learning, mean value estimation, randomized response, homomorphic encryption