Research On Federated Learning Methods And Applications For Heterogeneous Data Sources

Posted on:2022-12-13

Degree:Master

Type:Thesis

Country:China

Candidate:S Xi

GTID:2518306614460714

Subject:Investment

Abstract/Summary:

Federated learning (FL) is a distributed machine learning method that enables multiple parties to collaboratively train a shared model without collecting their local data and transferring it to a central server for storage. As a result, resource consumption in the cloud is reduced and client privacy is enhanced. However, unlike traditional artificial intelligence techniques, this distributed learning method and its applications face major challenges. First, the clients are independent of each other and do not share data. Differences in data collection methods and data sources lead to different distributions across the clients' datasets, which seriously degrades the performance of the trained model. Second, when federated learning is applied to medical diagnosis, the datasets are relatively small compared with common public datasets, so hospitals and other institutions may not trust the performance of the trained model. Third, medical data contains private patient information in many respects. While addressing the heterogeneity and availability of data, privacy protection of the datasets may be neglected, creating a risk that real data is disclosed. Therefore, privacy and security remain a major challenge within the federated learning framework.

To solve the above problems, this paper proposes the FedSim model, a federated learning algorithm based on cosine similarity. For data heterogeneity, it mainly considers the non-independent and identically distributed (Non-IID) problem of the datasets. To address local imbalance, the calculation of the loss function is improved to force the local model to converge faster and to improve model performance. To address global imbalance, the cosine similarity between the global distribution and the local distribution is used as a new weight for server aggregation, alleviating the performance degradation caused by the Non-IID problem. A residual neural network was selected for model training on the local clients, and the experimental results show that the model outperforms the baseline model under different Non-IID settings.

In the applied research on federated learning, this paper performs diagnostic classification on chest X-ray images of patients with novel coronavirus (COVID-19) pneumonia and proposes a Federated Differential Privacy Generative Adversarial Network (FedDPGAN) model. Specifically, it uses a distributed DPGAN to generate diverse patient data and increase the number of training samples. During GAN training, the discriminator must distinguish generated data from real data. To protect data privacy in this process, differential privacy is introduced to ensure the privacy and security of the real training data. More importantly, the model can alleviate the impact of the Non-IID problem in the training data on model performance. In the experiments, this paper compares the model with multiple baseline models and tests its diagnostic accuracy under both IID and Non-IID settings of the dataset, while also considering accuracy under different degrees of privacy protection. In these experiments, the model performs better than the baseline models.

In summary, to address the Non-IID problem of datasets, this paper proposes the FedSim model as a federated learning algorithm and the FedDPGAN model as a federated learning application for medical diagnosis, and evaluates model performance on the FEMNIST dataset and the COVID-19 dataset.
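
The abstract describes using the cosine similarity between the global and local distributions as the weight for server aggregation, but gives no formula. The following is only a minimal sketch of that idea under several assumptions that do not come from the thesis: each distribution is represented by a per-class label count vector, the global distribution is the sum of the clients' counts, the similarities are normalized so the weights sum to 1, and model parameters are flat lists of floats. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_weights(client_counts: Sequence[Sequence[float]]) -> List[float]:
    """Aggregation weight per client from its label counts (assumed representation).

    Cosine similarity is scale-invariant, so raw counts can be compared
    directly without first converting them to probabilities.
    """
    num_classes = len(client_counts[0])
    # Assumption: the global distribution is the sum of all client label counts.
    global_counts = [sum(c[j] for c in client_counts) for j in range(num_classes)]
    sims = [cosine_similarity(c, global_counts) for c in client_counts]
    total = sum(sims)
    if total == 0.0:
        # Degenerate case (no usable counts): fall back to uniform weights.
        return [1.0 / len(client_counts)] * len(client_counts)
    # Assumption: normalize so that the weights sum to 1.
    return [s / total for s in sims]


def aggregate(client_params: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Weighted average of flat client parameter vectors."""
    dim = len(client_params[0])
    return [sum(w * p[i] for w, p in zip(weights, client_params))
            for i in range(dim)]


if __name__ == "__main__":
    # Two balanced clients and one client skewed toward class 0.
    counts = [[50, 50], [50, 50], [90, 10]]
    params = [[1.0, 2.0], [1.0, 2.0], [4.0, 8.0]]
    w = similarity_weights(counts)
    print("weights:", w)
    print("aggregated:", aggregate(params, w))
```

In this toy setup the skewed client receives a smaller weight than the two balanced clients, because its label counts point in a direction further from the pooled counts. A real system would also need a way for the server to obtain or estimate the distributions without violating the privacy goals the abstract emphasizes; that is outside the scope of this sketch.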

Keywords/Search Tags:

Federated Learning, Non-independent And Identical Distribution, Cosine Similarity, Differentially Private, Generative Adversarial Networks

Related items

1	Data-Oriented Federated Learning Research
2	Federated Traffic Synthesizing And Classification Using Generative Adversarial Networks
3	Federated Generative Adversarial Optimization Algorithm For Non-IID Data
4	Research On Transfer Learning Based On Generative Adversarial Networks
5	Research On Adversarial Generation Network Based On Similarity Evaluation
6	Research On Image Processing Algorithm Based On Non-independent Identical Distribution
7	Research On The Generation Of Independent Poster Based On Generative Adversarial Network(GAN)
8	Image Blind Motion Deblur Algorithm Based On Generative Adversarial Networks
9	Research On Image Generation Method Based On Generative Adversarial Network
10	Research Of Recommendation Algorithm Based On Joint Similarity And Generative Adversarial Nets