
Research On Federated Optimization Algorithms For Data Heterogeneity And Device Heterogeneity

Posted on: 2024-02-08    Degree: Master    Type: Thesis
Country: China    Candidate: X Wang    Full Text: PDF
GTID: 2568307103470054    Subject: Computer technology
Abstract/Summary:
With the rapid development of technologies such as the mobile internet and the Internet of Things, the concept of the "Internet of Everything" has become a new trend in information technology. The explosive growth in the number of terminal devices has led to a drastic increase in user data, driving the use of artificial intelligence for analyzing massive amounts of user data. To protect privacy, sensitive user data, such as health information and identity information, must be kept isolated within networks. Traditional centralized training of artificial intelligence models carries a high risk of privacy leakage and cannot meet these needs. Federated learning has therefore emerged as a new research hotspot, as it enables artificial intelligence modeling while keeping user data isolated.

Federated learning is a distributed machine learning technique that coordinates multiple clients, each of which trains its own model locally. A central server aggregates the client models into a global model, and this process repeats until the global model converges. Because the original data are never uploaded to the server, the privacy and security of client data are preserved. In practical applications, however, two common issues arise:

(1) Data heterogeneity leads to low global model accuracy. Shaped by each client's own preferences, the datasets held by the clients are, on the whole, non-independent and identically distributed (Non-IID), i.e., data-heterogeneous. This causes significant deviations between the models of different clients and degrades the accuracy of the final global model.

(2) Device heterogeneity leads to low model training efficiency. Clients in federated learning may be end devices of many kinds, such as phones, computers, and smart home devices. The system resources of different devices, including computation capability, communication bandwidth, and storage capacity, often differ significantly, resulting in device heterogeneity. During model aggregation, the server assigns the same number of local epochs to every client, so faster clients idle for long periods waiting for slower ones, greatly reducing overall training efficiency.

In current research on data heterogeneity in federated learning, gradient-correction methods often double the communication of the federated system, incurring a high communication cost, while regularization methods yield no significant improvement in global accuracy because their regularization terms are not designed around the internal structure of the model. In research on device heterogeneity, asynchronous communication schemes cannot be applied to the federated system because the bounded-delay assumption does not hold, and active client sampling improves training efficiency to some extent but may also reduce overall model accuracy. This thesis presents solutions to the above problems, with the main research efforts focusing on the following aspects:

(1) To address the low global model accuracy caused by data heterogeneity, the thesis proposes a federated optimization algorithm called TS-FedPC. The algorithm divides the model into two parts, an encoder and a classifier, and through prototype contrastive training of the encoder and unbiased correction of the classifier, it weakens in a targeted manner the impact of heterogeneous data on the internal structure of the global model, effectively improving global model accuracy in data-heterogeneous environments. Experimental results show that, compared with similar algorithms, TS-FedPC achieves the highest model accuracy on different datasets while maintaining a low communication volume. A minimal sketch of the prototype contrastive idea appears below.
(2) To address the low model training efficiency caused by device heterogeneity, a federated training method based on dynamically setting local epochs is proposed. The method builds on the idea that a client's model training time can guide the dynamic setting of its local epochs. First, a weight model is designed to analyze how the model affects training time. Then, combined with dimensionality-reduction rules, the key features with the greatest impact on training time are extracted, reducing the amount of data and the number of features needed to predict training time and improving the feasibility of the training-time prediction algorithm in a federated system. Finally, an algorithm that dynamically sets local epochs based on predicted client training time is provided to improve overall training efficiency; a sketch of one plausible assignment rule follows.
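The abstract describes the pipeline (predict per-client training time, then set local epochs from it) without giving the rule itself, so the sketch below shows one plausible assignment under stated assumptions: the slowest client defines the round deadline, and every other client receives as many epochs as fit within it. The linear budget rule, the bounds e_min and e_max, and all names are hypothetical.

```python
def assign_local_epochs(pred_epoch_time, e_min=1, e_max=10):
    """Give each client as many local epochs as fit into the round budget
    implied by the slowest client running e_min epochs, so fast clients
    keep training instead of idling.

    pred_epoch_time: dict mapping client id -> predicted seconds per local
    epoch (the output of a training-time predictor such as the one the
    thesis proposes). The budget rule here is an illustrative assumption.
    """
    budget = max(pred_epoch_time.values()) * e_min   # round deadline
    return {cid: max(e_min, min(e_max, int(budget // t)))
            for cid, t in pred_epoch_time.items()}

# Example: the slowest client sets a 12 s round; faster clients fill it.
print(assign_local_epochs({"phone": 12.0, "laptop": 3.0, "tower": 1.5}))
# {'phone': 1, 'laptop': 4, 'tower': 8}
```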
Keywords/Search Tags: Federated Learning, Data Heterogeneity, Device Heterogeneity, Prototype Contrastive Learning, Time Prediction