
Distilled Federated Learning Based On Tree Model

Posted on: 2022-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Shan
Full Text: PDF
GTID: 2518306572455174
Subject: Applied Mathematics
Abstract/Summary:
With the development of artificial intelligence technology, data-driven AI techniques are playing important roles in many aspects of society. However, because data are often insufficient in quantity and quality, and because privacy-protection agreements bind some enterprises, the AI industry still faces two major problems: data barriers between organizations and privacy protection. Federated learning has emerged as a way to address both. It enables participating enterprises to learn collaboratively across devices while keeping raw data private: through this special form of distributed learning, enterprises exchange model update information (i.e., gradient information) and jointly train a federated model, thereby sharing resources. Nevertheless, three key problems in federated learning remain to be solved: communication cost, data heterogeneity, and privacy protection.

This thesis proposes a new federated learning framework based on distilled data, which transmits synthetic distilled data between clients instead of the parameter updates used in traditional federated learning, in order to address these three problems. Whereas traditional dataset distillation algorithms target image classification, structured data in industry is commonly handled by tree ensemble models such as random forest and XGBoost. The thesis therefore adapts the algorithm specifically to structured data, proposing a new dataset distillation algorithm based on decision trees that follows the way tree models are generated: it aggregates the original dataset and the model information into a small synthetic dataset. In addition, the thesis constructs a distilled-data-based federated learning framework built on tree models, which can select different tree models or tree ensembles, such as random forest, according to the type of data. The algorithm is verified both theoretically and experimentally: the distilled data are synthetic and do not reveal the specific underlying records. Transmitting distilled data allows distilled federated learning to reduce communication cost, balance the data distribution across clients, and bring the learning performance close to the theoretical upper bound of federated learning.
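The overall round-trip described above (clients compress local data into a few synthetic points, the server pools them and fits a tree model) can be illustrated with a minimal sketch. Note the assumptions: this is not the thesis's actual distillation algorithm; here distillation is approximated naively by per-class feature means, the "tree" is a one-dimensional decision stump, and the `distill`/`fit_stump` names are hypothetical.

```python
# Sketch of one round of distilled federated learning on 1-D data.
# Assumption: per-class means stand in for true dataset distillation,
# and a depth-1 stump stands in for a full tree ensemble.
from statistics import mean

def distill(client_data):
    """Compress a client's (x, y) pairs to one synthetic point per class."""
    by_class = {}
    for x, y in client_data:
        by_class.setdefault(y, []).append(x)
    return [(mean(xs), y) for y, xs in by_class.items()]

def fit_stump(points):
    """Fit a 1-D decision stump: threshold halfway between the two class centers."""
    by_class = {}
    for x, y in points:
        by_class.setdefault(y, []).append(x)
    centers = sorted((mean(xs), y) for y, xs in by_class.items())
    threshold = (centers[0][0] + centers[1][0]) / 2
    low_label, high_label = centers[0][1], centers[1][1]
    return lambda x: low_label if x <= threshold else high_label

# Two clients with skewed (non-IID) data: client A holds mostly class 0,
# client B mostly class 1.
client_a = [(0.1, 0), (0.3, 0), (0.2, 0), (2.9, 1)]
client_b = [(3.1, 1), (2.8, 1), (3.3, 1), (0.4, 0)]

# Each client transmits only its distilled points -- never raw records
# or gradients -- and the server trains on the pooled synthetic set.
pooled = distill(client_a) + distill(client_b)
model = fit_stump(pooled)

print(model(0.2), model(3.0))  # -> 0 1
```

The sketch also shows why this can help with heterogeneity: even though each client's label distribution is skewed, the pooled synthetic set contains points from every class each client observed, so the server sees a more balanced picture than any single client's raw data.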
Keywords/Search Tags: federated learning, dataset distillation, decision tree, non-independent and identically distributed (non-IID) data