
Efficient Federated Learning Framework Design For Non-IID Data

Posted on: 2024-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Du
Full Text: PDF
GTID: 2568307127460984
Subject: Computer technology
Abstract/Summary:
With the continuous innovation of algorithms and the growth of hardware computing power, artificial intelligence technology represented by deep neural networks has developed rapidly. However, due to competition among enterprises and platforms, and the promulgation of privacy-protection regulations, it is difficult for the parties involved to share data, and the AI field faces the dilemma of "data silos". To solve the data-sharing problem, the concept of Federated Learning (FL) was born. Federated learning replaces data sharing with model sharing, and can accomplish collaborative training and modeling among participants while protecting user data privacy. However, federated learning still faces many problems and challenges in research and application. First, data are Not Identically and Independently Distributed (Non-IID) owing to differences in geographic location and devices. Second, as client-side training models continue to grow, communication efficiency has become a major bottleneck of federated learning. This thesis therefore focuses on improving federated learning performance in the Non-IID data scenario. The specific work is summarized as follows.

(1) To address the problem that noisy client data degrades the accuracy of federated learning models in realistic scenarios, this thesis proposes a federated learning data-cleaning algorithm. Before the server issues the global model, a benchmark model is trained on a small-scale benchmark dataset at the server and shared with the clients. Each client can then screen out its local noisy data, improving the accuracy of subsequent model training.

(2) To address the data heterogeneity caused by client differences in federated learning, this thesis proposes a federated learning framework based on CutMix. The framework requires clients to share locally averaged data, and gradients are computed by mixing the averaged data with private data using an optimized CutMix algorithm. By controlling the mixing parameter, a trade-off can be made between the amount of information exchanged and data privacy, effectively improving the performance of the federated learning model under Non-IID data.

(3) To improve the communication efficiency of federated learning, this thesis introduces the Count Sketch data structure into federated learning and compresses model data before the client uploads it, effectively reducing the communication volume while preserving model accuracy. Error-accumulation and momentum-accumulation algorithms are added in the server aggregation stage to eliminate the errors introduced by the compression algorithm and accelerate model convergence.
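The noisy-data screening step in contribution (1) can be illustrated with a minimal sketch. The abstract does not specify the screening rule, so this example assumes a common approach: score each local sample by its cross-entropy loss under the server's benchmark model and drop samples whose loss exceeds a threshold. The function name `screen_noisy_samples`, the `benchmark_predict` callable, and the `loss_threshold` value are all illustrative, not from the thesis.

```python
import numpy as np

def screen_noisy_samples(benchmark_predict, X, y, loss_threshold=2.0):
    """Drop local samples that the benchmark model finds implausible.

    benchmark_predict: callable mapping X to per-class probability rows.
    Samples whose cross-entropy loss under the benchmark model exceeds
    loss_threshold are treated as noisy and removed.
    """
    probs = benchmark_predict(X)                          # shape (n, num_classes)
    eps = 1e-12                                           # guard against log(0)
    losses = -np.log(probs[np.arange(len(y)), y] + eps)   # per-sample CE loss
    keep = losses <= loss_threshold
    return X[keep], y[keep]
```

A client would apply this filter once, before its first local training round, so that subsequent gradient updates are computed only on the cleaned subset.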
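The mixing step in contribution (2) can be sketched in the style of the standard CutMix operation: a rectangular patch from the shared average sample is pasted into the private sample, and the label is mixed in proportion to the retained area. The thesis's optimized variant is not specified here; this is a plain CutMix sketch in which the parameter `lam` plays the role of the information/privacy trade-off knob mentioned above. All names are illustrative.

```python
import numpy as np

def cutmix_with_avg(x_private, y_private, x_avg, y_avg, lam, rng=None):
    """Paste a patch of the shared average sample into the private sample.

    lam in (0, 1) is the target fraction of the private image to keep;
    the label is mixed according to the actually retained area.
    """
    rng = rng or np.random.default_rng()
    H, W = x_private.shape[:2]
    cut_h = int(H * np.sqrt(1.0 - lam))       # patch size from lam
    cut_w = int(W * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(H), rng.integers(W)  # random patch center
    y1, y2 = np.clip(cy - cut_h // 2, 0, H), np.clip(cy + cut_h // 2, 0, H)
    x1, x2 = np.clip(cx - cut_w // 2, 0, W), np.clip(cx + cut_w // 2, 0, W)
    x_mixed = x_private.copy()
    x_mixed[y1:y2, x1:x2] = x_avg[y1:y2, x1:x2]
    lam_adj = 1.0 - (y2 - y1) * (x2 - x1) / (H * W)  # retained fraction
    y_mixed = lam_adj * y_private + (1.0 - lam_adj) * y_avg
    return x_mixed, y_mixed
```

A larger `lam` keeps more private pixels (less shared information per gradient step); a smaller `lam` injects more of the averaged data, which is the trade-off the framework exposes.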
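The compression step in contribution (3) can be illustrated with a basic Count Sketch over a flattened gradient vector: each coordinate is hashed into one of `width` buckets per row, with a random sign, and a coordinate is later estimated as the median over the `depth` rows. The error-feedback and momentum terms applied at the server are omitted here; the sketch table itself is what the client would upload in place of the full gradient. Function names and parameters are illustrative.

```python
import numpy as np

def count_sketch(grad, width, depth, seed=0):
    """Compress a gradient vector into a depth x width Count Sketch table."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, width, size=(depth, grad.size))     # bucket hashes
    sign = rng.choice([-1.0, 1.0], size=(depth, grad.size))   # sign hashes
    table = np.zeros((depth, width))
    for r in range(depth):
        # unbuffered accumulation: repeated buckets must all be added
        np.add.at(table[r], idx[r], sign[r] * grad)
    return table, idx, sign

def count_sketch_query(table, idx, sign):
    """Estimate each coordinate as the median over the depth sketch rows."""
    depth, _ = idx.shape
    est = np.stack([sign[r] * table[r, idx[r]] for r in range(depth)])
    return np.median(est, axis=0)
```

Because gradients are typically dominated by a few large coordinates, the `depth * width` table can be much smaller than `grad.size` while the heavy coordinates are still recovered accurately; in practice client and server would derive `idx` and `sign` from a shared seed rather than transmitting them.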
Keywords/Search Tags:Federated learning, Not Identically and Independently Distributed, Data cleaning, Data augmentation, Model compression