
Multi-model Federated Learning Based On Distilled Data

Posted on: 2022-06-11    Degree: Master    Type: Thesis
Country: China    Candidate: M Sun    Full Text: PDF
GTID: 2518306572955169    Subject: Applied Mathematics
Abstract/Summary:
In today's big data era, the effective integration of multi-party data resources can promote the development of many industries. However, data sharing is often restricted by privacy regulations. Federated learning allows multiple participants to train a model collaboratively without exposing their local data, but in practical applications it is often accompanied by the problem of non-independent and identically distributed (Non-IID) data among participants, which restricts its further development. It is therefore meaningful to design an effective framework for the Non-IID problem in federated learning. The clustered federated learning algorithm (C-FL) addresses the Non-IID problem by training multiple models and is currently applied within the general clustered federated learning (CFL) framework. However, when clients hold many different data distributions, the whole federated training process requires many rounds of cloud-edge communication; especially when the federated model structure is complex, this incurs a large communication cost. At the same time, transmitting the clients' local model updates risks leaking the privacy of user data.

This thesis studies the training of horizontal federated learning models under Non-IID client data and overcomes the deficiencies of the general CFL framework. Firstly, it proposes a distilled clustered federated learning algorithm (DC-FL) with privacy-protection capability. DC-FL reduces the cloud-edge communication rounds of C-FL by using locally distilled data from each client to guide the server in grouping clients, which ultimately helps each client train its own personalized model. Secondly, based on the DC-FL algorithm, it designs a new framework for clustered federated learning based on distilled data (CFL-D), which overcomes the communication-cost limitation of the general CFL framework when the client data distributions are numerous, guarantees user data privacy, and realizes multi-model federated learning training.

Finally, the effectiveness of the general CFL framework in solving the Non-IID problem of client data is verified experimentally, and the CFL-D framework is implemented. CFL-D is also compared with CFL in terms of communication rounds and traffic. Experiments show that the CFL-D framework meets the requirements of user privacy protection; on the EMNIST dataset, the total number of cloud-edge communications and the total traffic are reduced by 24.62% and 21.29%, respectively, compared with the CFL framework, and the reduction is more pronounced when clients hold more types of data labels.
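To make the DC-FL pipeline described above concrete, the following minimal Python sketch illustrates its three steps: each client summarizes its local data as a handful of distilled examples, the server clusters clients by those summaries, and each cluster averages its members' model updates into a personalized model. Everything here is an illustrative assumption rather than the thesis's exact method: the distilled summaries are simulated instead of being optimized by a real dataset-distillation procedure, k-means stands in for the server's grouping rule, and per-cluster FedAvg stands in for the multi-model training step.

import numpy as np

rng = np.random.default_rng(0)

N_CLIENTS, N_CLUSTERS = 8, 2
DISTILLED_PER_CLIENT, FEAT_DIM, MODEL_DIM = 4, 16, 32

# --- Client side: stand-in for dataset distillation ----------------------
# Real distillation optimizes a few synthetic examples so that training on
# them approximates training on the full local set; here each client's
# summary is simply drawn around one of two latent distributions (Non-IID).
centers = rng.normal(size=(N_CLUSTERS, FEAT_DIM))
true_group = rng.integers(0, N_CLUSTERS, size=N_CLIENTS)
distilled = np.stack([
    centers[g] + 0.1 * rng.normal(size=(DISTILLED_PER_CLIENT, FEAT_DIM))
    for g in true_group
])  # shape: (clients, distilled examples, features)

# Each client also holds a local model update (flattened weights).
local_weights = rng.normal(size=(N_CLIENTS, MODEL_DIM))

# --- Server side: group clients by their distilled summaries -------------
summaries = distilled.mean(axis=1)  # one vector per client

def kmeans(x, k, iters=20):
    """Plain k-means; an illustrative choice for the server's clustering."""
    cent = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((x[:, None] - cent[None]) ** 2).sum(-1), axis=1)
        cent = np.stack([x[labels == j].mean(0) if np.any(labels == j)
                         else cent[j] for j in range(k)])
    return labels

assignment = kmeans(summaries, N_CLUSTERS)

# --- Per-cluster FedAvg: one personalized model per client group ---------
cluster_models = {
    j: local_weights[assignment == j].mean(axis=0)
    for j in range(N_CLUSTERS) if np.any(assignment == j)
}
print("cluster assignment:", assignment)

Because the small distilled summaries, rather than repeated rounds of model updates, are what drive the grouping step, this sketch is consistent with the abstract's claim that distilled data can guide server-side clustering while reducing cloud-edge communication.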
Keywords/Search Tags: federated learning, user clustering, dataset distillation, personalized model, non-independent and identically distributed (Non-IID)