With the widespread use of artificial intelligence and mobile devices, more and more application scenarios require multi-party collaboration, so the demand for distributed machine learning keeps increasing. However, with the emergence of data silos and growing privacy awareness, the traditional distributed machine learning training paradigm faces great challenges. Against this background, federated learning emerged as a privacy-preserving distributed machine learning framework that allows clients to collaboratively train a machine learning model without their data ever leaving the local device. However, existing research has shown that federated learning contains vulnerabilities that endanger data security, and attackers both inside and outside the system can exploit them to compromise the privacy of clients' data. The latest research shows that a client's training data can be reconstructed solely from the gradient information it sends to the server, thereby leaking the client's private information. Therefore, this thesis studies the data privacy problem caused by gradient leakage in federated learning, analyzes gradient leakage attacks in depth, and proposes an adaptive differential privacy defense scheme. The main research contents of this thesis are as follows:

(1) An in-depth study of the privacy leakage attack caused by gradient sharing in federated learning. First, it is proved through mathematical derivation that the gradients of a shallow neural network and of a shallow convolutional neural network can be inverted to recover the original training data, and the attack is implemented on the MNIST dataset. Then, this thesis studies a gradient leakage attack against deep neural networks, which updates dummy data by minimizing the Euclidean distance between the gradient produced by the dummy data and the gradient of the real data. The feasibility of this deep gradient leakage attack is verified on the MNIST, CIFAR-10, CIFAR-100, and LFW datasets. The experimental results show that the attack can recover a client's private data from the model gradients alone, so gradient leakage is a privacy threat that cannot be ignored.
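To make the gradient-matching procedure in (1) concrete, the following is a minimal PyTorch-style sketch rather than the thesis's exact implementation: the model, input shape, soft-label parameterization, optimizer settings, and iteration count are assumptions made for illustration. Randomly initialized dummy data and a dummy label are optimized so that the gradient they induce approaches the client's observed gradient in Euclidean distance.

```python
import torch
import torch.nn.functional as F

def gradient_matching_attack(model, real_grads, input_shape, num_classes,
                             steps=300, lr=0.1):
    """Illustrative gradient-matching reconstruction (assumed setup).

    real_grads: list of tensors, the gradients the client shared with the server.
    Returns optimized dummy data and a dummy (soft) label.
    """
    dummy_x = torch.randn(1, *input_shape, requires_grad=True)
    dummy_y = torch.randn(1, num_classes, requires_grad=True)  # soft label
    optimizer = torch.optim.LBFGS([dummy_x, dummy_y], lr=lr)

    for _ in range(steps):
        def closure():
            optimizer.zero_grad()
            pred = model(dummy_x)
            # Cross-entropy with a soft dummy label
            loss = torch.sum(-F.softmax(dummy_y, dim=-1)
                             * F.log_softmax(pred, dim=-1))
            dummy_grads = torch.autograd.grad(loss, tuple(model.parameters()),
                                              create_graph=True)
            # Euclidean (L2) distance between dummy and real gradients
            grad_diff = sum(((dg - rg) ** 2).sum()
                            for dg, rg in zip(dummy_grads, real_grads))
            grad_diff.backward()
            return grad_diff
        optimizer.step(closure)

    return dummy_x.detach(), dummy_y.detach()
```

As the gradient distance shrinks, the dummy data converges toward the client's original training sample, which is the core mechanism behind the attack studied in (1).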
(2) An adaptive differential privacy defense method against gradient leakage attacks is proposed. To address the problem that existing differential privacy schemes usually sacrifice model performance to strengthen privacy protection, and inspired by the observation that the outputs of some neurons approach zero during iterative training, this thesis proposes a differential privacy scheme that adaptively allocates the privacy budget according to the importance of each layer of the neural network. First, the client pretrains the model on its local dataset and calculates the importance of each layer; privacy budgets are then allocated according to layer importance, so that more important layers receive less noise and the accuracy of the model is preserved. A tighter upper bound on the sensitivity is obtained by selecting a different clipping threshold for each layer. Experimental results show that, compared with the traditional differential privacy scheme, the proposed scheme achieves accuracy close to that of ordinary federated learning. Finally, through designed experiments, the proposed scheme is compared with privacy protection schemes based on homomorphic encryption and model compression in terms of model accuracy and performance cost, and its effectiveness is verified. A gradient leakage attack is further used to confirm that the proposed scheme protects client privacy; experimental results show that it can defend against the deep gradient leakage attack.
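The following is a minimal sketch of the per-layer adaptive perturbation idea in (2), assuming Gaussian-mechanism noise; the importance scores, the proportional budget-splitting rule, and the per-layer clipping thresholds shown here are illustrative assumptions rather than the thesis's exact algorithm.

```python
import math
import torch

def adaptive_dp_perturb(named_grads, importance, total_epsilon, delta,
                        clip_norms):
    """Clip and noise per-layer gradients under a per-layer privacy budget.

    named_grads : dict, layer name -> gradient tensor
    importance  : dict, layer name -> importance score from local pretraining
    total_epsilon, delta : overall (epsilon, delta) budget for this round
    clip_norms  : dict, layer name -> per-layer clipping threshold C_l
    """
    total_imp = sum(importance.values())
    noisy = {}
    for name, grad in named_grads.items():
        # More important layers get a larger share of the budget,
        # hence a smaller noise scale (less perturbation).
        eps_l = total_epsilon * importance[name] / total_imp
        c_l = clip_norms[name]

        # Per-layer clipping bounds this layer's sensitivity to c_l.
        scale = torch.clamp(c_l / (grad.norm() + 1e-12), max=1.0)
        clipped = grad * scale

        # Gaussian mechanism: sigma proportional to sensitivity / epsilon.
        sigma = c_l * math.sqrt(2.0 * math.log(1.25 / delta)) / eps_l
        noisy[name] = clipped + torch.randn_like(clipped) * sigma
    return noisy
```

In a federated setting, each client would apply such a routine to its per-layer gradients after local training and before uploading them to the server, so that the shared gradients no longer allow faithful reconstruction of the local data.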