
Research And Implementation On Privacy Protection Technology For Training Samples In Machine Learning

Posted on: 2021-12-13    Degree: Master    Type: Thesis
Country: China    Candidate: J Chen    Full Text: PDF
GTID: 2518306308970239    Subject: Cyberspace security
Abstract/Summary:
In recent years, with the continuous development and maturation of the underlying theory, machine learning technology has been widely adopted across industries. Machine learning algorithms learn statistical knowledge from training data and predict unknown cases to assist human decision-making. Sufficient data is an essential condition for machine learning, so massive amounts of data are collected and used to improve model performance. In some application scenarios, such as medical treatment and personalized recommendation, user data inevitably involves personal privacy that users are reluctant to disclose. Both the availability (utility) and the privacy of user data should be considered when such data are collected and used.

To protect data privacy, this paper proposes a scheme based on particle swarm optimization that generates a new data set from the original data set and trains the model on the new data set, so that the model never directly touches the original data. This paper first examines the membership inference attack, which determines whether a given record was in the model's training set. Based on the attack's algorithmic principle and its effectiveness on several public data sets, the paper analyzes which weaknesses the attack exploits and how sensitive it is to differently distributed data, and summarizes the characteristics of data and models that resist the attack well. Guided by these findings, this paper proposes a method for generating a new data set from the original one: particle swarm migration. The method takes into account both the weaknesses exploited by the attack and the utility of the data, so that the generated data set is difficult to attack while the loss of model accuracy remains controllable. In addition, noise is added to the gradients during model training so that the training process satisfies differential privacy.

Comprehensive experiments are conducted on the MNIST data set. The results show that the proposed scheme resists the membership inference attack well with only a small loss of model accuracy, and that the particle swarm sample migration method offers better protection and a smaller accuracy loss than the random-noise method. Based on the proposed algorithm, this paper designs and implements a machine learning model training system that protects the privacy of training samples, describes its main functional modules and workflow in detail, and demonstrates its usability through experiments. The results show that the system protects the privacy of the training data well with a controlled loss of accuracy.
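To make the threat model concrete, the following is a minimal sketch of one common form of membership inference: a confidence-threshold attack, which exploits the tendency of overfitted models to be more confident on their training examples. The names `target_model`, `records`, and `threshold` are illustrative assumptions (the thesis may use a shadow-model variant); the sketch assumes a scikit-learn-style classifier exposing `predict_proba`.

```python
import numpy as np

def confidence_attack(target_model, records, threshold=0.9):
    """Guess membership from prediction confidence.

    A model is often more confident on examples it was trained on,
    so a high top-class probability is taken as evidence that the
    record was a member of the training set. This is only a sketch
    of the general attack idea, not the thesis's exact procedure.
    """
    probs = target_model.predict_proba(records)   # shape (n_records, n_classes)
    confidence = probs.max(axis=1)                # top-class probability per record
    return confidence >= threshold                # True = "member" guess
```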
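The gradient-noise component can likewise be illustrated with a simplified training step: the batch gradient is clipped and Gaussian noise is added before the optimizer update. This is a sketch in PyTorch under assumed settings; a formal differential privacy guarantee (as in DP-SGD) additionally requires per-example clipping and privacy accounting, and `clip_norm` and `sigma` here are illustrative values, not the parameters used in the thesis.

```python
import torch

def noisy_sgd_step(model, loss, optimizer, clip_norm=1.0, sigma=0.5):
    """One training step with a clipped, noised batch gradient.

    Simplified illustration of noise injection into gradients:
    clip the overall gradient norm, add Gaussian noise scaled by
    sigma * clip_norm, then apply the optimizer update.
    """
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
    for p in model.parameters():
        if p.grad is not None:
            p.grad += torch.randn_like(p.grad) * sigma * clip_norm
    optimizer.step()
```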
Keywords/Search Tags:Machine learning, Privacy protection, Differential privacy, Particle swarm optimization