Research On Fair Privacy Gradient Boosting Decision Tree System Based On Trusted Execution Environment

Posted on: 2024-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: M Huang
Full Text: PDF
GTID: 2568307061991769
Subject: Software engineering
Abstract/Summary:
Federated learning has become a popular training paradigm for machine learning. Unlike centralized training, federated learning keeps data local and updates the global model by aggregating the clients' local models in each iteration, which reduces the data privacy risks of traditional centralized training. Federated learning has also attracted attention because it alleviates the problem of data silos. Gradient Boosting Decision Tree (GBDT) is a widely studied machine learning algorithm that performs well in fitting and prediction tasks. In recent years, research has focused on GBDT systems in federated learning scenarios. Training GBDT models in a federated setting can leverage the advantages of federated training and improve model performance by alleviating data silos and local data privacy concerns.

However, security and privacy remain challenges in federated GBDT systems. Because each client's training is opaque to the other clients during federated learning, the integrity of the training process cannot be guaranteed. In addition, during federated GBDT training, malicious clients may leak leaf weights and gradients, revealing private information about the participating clients. Designing a fair contribution mechanism that motivates clients to participate in federated learning is a further challenge, and there is currently relatively little research on fair contribution allocation in federated GBDT. Further research is therefore needed on security, privacy, and fair incentive mechanisms in federated GBDT systems.

This thesis proposes a secure federated GBDT scheme aimed at fairness and verifiability, as well as a privacy protection scheme for federated GBDT based on a trusted execution environment (TEE). The main research content is as follows:

(1) First, this thesis proposes a secure and fair federated GBDT scheme that relies on SGX to achieve system integrity and performs fair contribution calculation across the TEE and the external untrusted execution environment. To adapt to the limited resources of the device and to prevent dishonest behavior during contribution calculation (such as modifying its intermediate results), the scheme adopts the "Adaptive Truncated Monte Carlo Approximation Shapley Value Method" and "Truncated and Non-Truncated Random Sampling". These methods reduce the computational and storage costs of the system and mitigate manipulated contribution calculations by malicious participants. The scheme constitutes a complete and practical federated GBDT system.

(2) Second, this thesis proposes a federated GBDT privacy protection scheme based on a secure execution environment. The scheme uses a joint training-and-evaluation strategy to select the best model in each round, and protects data privacy by adding noise that satisfies differential privacy to the model. To avoid spending privacy budget on models that are not selected, the evaluation of each round's decision trees is carried out inside SGX. In addition, a "dynamic privacy budget allocation" method adapted to the training characteristics of GBDT is proposed to balance privacy protection against model accuracy. The scheme provides differential-privacy-level protection for the federated GBDT model while maintaining high model accuracy.

In summary, this thesis investigates federated learning systems for gradient boosting decision trees. While preserving system accuracy, it designs a fair and verifiable benefit allocation system and a privacy protection system. The former provides a fair, accurate, efficient, and verifiable benefit allocation scheme based on a trusted execution environment; the latter provides accurate privacy protection at the level of differential privacy. Comprehensive experimental results indicate that the proposed schemes achieve good efficiency and performance.
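
To make the contribution calculation in (1) more concrete, the following is a minimal Python sketch of plain truncated Monte Carlo Shapley estimation. It is not the thesis's adaptive variant, and it does not model the SGX enclave or the verification steps. The function name `tmc_shapley`, the `utility` callback, and the parameters `n_permutations` and `tolerance` are assumptions introduced for illustration. In a federated GBDT setting, `utility` would presumably score a model trained on a given set of clients; here it is an arbitrary function from a set of client indices to a number.

```python
import random
from typing import Callable, FrozenSet, List


def tmc_shapley(
    n_clients: int,
    utility: Callable[[FrozenSet[int]], float],
    n_permutations: int = 200,
    tolerance: float = 1e-3,
    seed: int = 0,
) -> List[float]:
    """Estimate each client's Shapley value by sampling random permutations.

    Hypothetical helper: `utility` maps a coalition of client indices to a
    score (for example, validation accuracy of a model trained on them).
    """
    rng = random.Random(seed)
    full_score = utility(frozenset(range(n_clients)))
    empty_score = utility(frozenset())
    totals = [0.0] * n_clients

    for _ in range(n_permutations):
        order = list(range(n_clients))
        rng.shuffle(order)
        coalition = set()
        prev_score = empty_score
        for client in order:
            # Truncation: once the coalition already scores about as well as
            # the full set, treat the remaining marginal contributions as zero
            # and skip the costly utility evaluations.
            if abs(full_score - prev_score) < tolerance:
                break
            coalition.add(client)
            score = utility(frozenset(coalition))
            totals[client] += score - prev_score  # marginal contribution
            prev_score = score

    # Average the marginal contributions over all sampled permutations.
    return [total / n_permutations for total in totals]


if __name__ == "__main__":
    # Toy additive utility: each client adds a fixed amount to the score.
    weights = [0.1, 0.3, 0.6]
    print(tmc_shapley(3, lambda s: sum(weights[i] for i in s)))
```

Truncation trades a small bias for fewer utility evaluations, which matters when each evaluation means training or scoring a model on a resource-limited device.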
Keywords/Search Tags:Federated Learning, Quantification of fair contribution, Gradient boosting decision trees, Privacy protection