
Method on Federated Learning for Unbalanced Data Based on Differential Privacy Protection

Posted on: 2021-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Huang
Full Text: PDF
GTID: 2428330611998842
Subject: Computer Science and Technology

Abstract/Summary:
In recent years, Artificial Intelligence (AI) has achieved great success in many fields, such as face recognition, smart medical treatment, natural language processing, and speech recognition. However, two major challenges remain in the AI field: leakage of user data privacy, and small, low-quality data. First, because training data involves sensitive information, deep learning methods may be exposed to the risk of privacy breaches. Second, companies or users in the AI field often hold only very small and low-quality datasets, and because the cost of manual labeling is high, most data is unlabeled or only partially labeled, which makes it difficult to train a model with high prediction accuracy. Considering industry competition, data privacy protection, and complex administrative processes, it is difficult for companies to pool their data and train AI models together, which creates the "data islands" problem. In addition, most real-life scenarios involve unbalanced data distributions: some users hold a large amount of data while others hold very little, which further complicates the training of federated learning models.

This dissertation focuses on training federated learning models with privacy protection on unbalanced data. Aiming at the "data islands" problem in the current AI field, the leakage of user data privacy, and the poor performance of federated learning models caused by unbalanced data, we propose a differentially private federated learning framework for unbalanced data.

First, in federated learning, each user shares only the model parameters produced during local training with a third-party cloud platform. The cloud platform aggregates the parameters and delivers them back to each user, and joint modeling proceeds through iterative training. After aggregation on the cloud platform, the parameters are processed by an encryption mechanism or a noise perturbation mechanism to protect the privacy of user data. Users share model parameters rather than raw data to jointly train the AI model, which indirectly breaks the "data islands" problem: users need not worry about privacy leaks caused by merging data, and data utilization is improved.

Second, our federated learning framework uses differential privacy to protect users' data privacy. Differential privacy processing is performed during the training of each user's local model, and the model parameter updates are then uploaded to the cloud platform. We assign different privacy budgets according to the amount of data each user owns, in order to address the problem caused by unbalanced data.

Third, for the local model update in the federated learning framework, we propose a novel differentially private convolutional neural network with adaptive gradient descent (DPAGD-CNN) algorithm. During parameter optimization, this algorithm adaptively adjusts the privacy budgets used for gradient perturbation and for selecting the optimal gradient descent step, effectively improving the prediction accuracy of the model.

Experiments show that our federated learning framework is more robust than existing works, and that it achieves good model performance while effectively protecting the data privacy of users participating in federated learning model training.
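To make the framework described above concrete, the following is a minimal, illustrative sketch (not the dissertation's actual implementation) of one federated round in which each client's privacy budget is allocated in proportion to its data volume, and each client's clipped local gradient is perturbed with Gaussian noise before the server averages the updates. All names (allocate_budgets, dp_local_update, federated_round, clip_norm, total_budget) and the specific noise calibration are assumptions made for illustration only.

```python
import numpy as np

def allocate_budgets(data_sizes, total_budget):
    """Give each client a privacy budget proportional to its share of the data."""
    sizes = np.asarray(data_sizes, dtype=float)
    return total_budget * sizes / sizes.sum()

def dp_local_update(global_weights, local_grad, epsilon,
                    clip_norm=1.0, delta=1e-5, lr=0.1):
    """One local step: clip the gradient, add Gaussian noise scaled to epsilon."""
    norm = np.linalg.norm(local_grad)
    clipped = local_grad * min(1.0, clip_norm / (norm + 1e-12))
    # Standard Gaussian-mechanism noise scale for (epsilon, delta)-DP on a clipped gradient.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy_grad = clipped + np.random.normal(0.0, sigma, size=clipped.shape)
    return global_weights - lr * noisy_grad

def federated_round(global_weights, client_grads, data_sizes, total_budget=1.0):
    """Server side: collect the clients' noisy updates and average them (FedAvg-style)."""
    budgets = allocate_budgets(data_sizes, total_budget)
    updates = [dp_local_update(global_weights, g, eps)
               for g, eps in zip(client_grads, budgets)]
    return np.mean(updates, axis=0)

# Toy usage: three clients with very unbalanced data volumes.
w = np.zeros(5)
grads = [np.random.randn(5) for _ in range(3)]
w = federated_round(w, grads, data_sizes=[10000, 500, 50])
```

The DPAGD-CNN contribution goes further than this sketch: within each local update, the per-client budget would additionally be split adaptively between perturbing the gradient and privately selecting the optimal descent step size.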
Keywords/Search Tags: artificial intelligence, federated learning, privacy protection, differential privacy, unbalanced data