With the rise of artificial intelligence and IoT technologies, the edge devices we use every day, such as smartphones, wearables, and smart-home sensors, continuously generate large volumes of data, and these data exist in the form of data silos. Ideally, training on data generated by different users can improve the generalization performance of machine learning models, so traditional machine learning approaches collect the data into data centers for centralized training. This collection practice, however, is not only inefficient but also raises serious privacy and security concerns. Federated learning, a machine learning technology that addresses both data silos and data privacy and security, is currently at a stage where its key technologies and real-world applications are being researched in parallel. Yet the scalability of existing federated learning systems is insufficient: it is difficult to improve overall system efficiency flexibly by upgrading or adding system resources. This paper therefore studies how to improve model training speed by increasing the concurrent processing capacity of individual server nodes, increasing the number of servers, and optimizing the communication interconnection between servers. The main research work and results of this paper are as follows.

First, this paper proposes a method to improve the server-side concurrency performance of asynchronous federated learning systems in the direction of "vertical scaling". In asynchronous federated learning systems represented by FedAsync, server-side concurrency is hindered by multiple threads competing for the global model data. This paper introduces a "shadow model" to alleviate this data competition among threads, which improves the server-side concurrency performance and
hence the scalability of the federated learning system.

Second, this paper designs and implements a prototype asynchronous federated learning system with a distributed architecture in the direction of "horizontal scaling". The system uses multiple servers to spread out the acceptance and processing of service requests from edge devices. Because the communication overhead between edge devices and servers is high while that within the server cluster is low, an asynchronous communication protocol is used between edge devices and servers, and a synchronous communication protocol is used within the server cluster. Experiments show that the overall concurrent device-processing capability of the system is improved, further improving the scalability of the federated learning system.
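The shadow-model idea behind the first contribution can be illustrated in a few lines: instead of holding the global-model lock for the whole FedAsync update, each handler thread snapshots the model into a private shadow copy, mixes the client update into the shadow outside the lock, and re-acquires the lock only briefly to publish the result. This is a minimal sketch under assumed details: the class and method names and the FedAsync-style mixing weight `alpha` are illustrative, not the thesis's actual implementation.

```python
import copy
import threading

class AsyncFLServer:
    """Sketch of a FedAsync-style server using a per-thread 'shadow model'.

    The lock is held only for the short snapshot and publish steps; the
    per-parameter mixing work happens on the shadow, outside the lock,
    so concurrent handler threads contend far less on the global model.
    All names and the mixing rule are illustrative assumptions.
    """

    def __init__(self, model, alpha=0.5):
        self.global_model = model          # e.g. a dict of parameter values
        self.alpha = alpha                 # FedAsync mixing weight
        self.lock = threading.Lock()

    def snapshot(self):
        # First short critical section: copy the global model into a shadow.
        with self.lock:
            return copy.deepcopy(self.global_model)

    def handle_update(self, client_model):
        shadow = self.snapshot()
        # Heavy per-parameter mixing runs on the shadow, outside the lock.
        for k in shadow:
            shadow[k] = (1 - self.alpha) * shadow[k] + self.alpha * client_model[k]
        # Second short critical section: publish the merged shadow.
        with self.lock:
            self.global_model = shadow
```

With `alpha = 0.5`, applying a client update of `1.0` to a global value of `0.0` yields `0.5`; a second identical update yields `0.75`, mirroring FedAsync's exponential mixing toward recent client models.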
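The hybrid communication pattern of the second contribution (asynchronous between edge devices and servers, synchronous within the server cluster) can likewise be sketched. In this sketch, a thread-safe queue stands in for the asynchronous edge-to-server channel, and a simple averaging round stands in for the intra-cluster synchronous protocol; the scalar "model" and all names are illustrative assumptions, not the prototype's actual interfaces.

```python
import queue

class ServerNode:
    """Sketch of one node in a multi-server federated learning cluster.

    Edge devices push updates asynchronously into an inbox queue, so they
    never block on server-side training state; the cluster then runs
    synchronous averaging rounds among its nodes.
    """

    def __init__(self, weight=0.0):
        self.weight = weight               # local view of the (scalar) model
        self.inbox = queue.Queue()         # asynchronous channel from devices

    def receive_async(self, client_weight):
        # Asynchronous path: enqueue and return immediately.
        self.inbox.put(client_weight)

    def absorb_pending(self, alpha=0.5):
        # Fold queued client updates into the local model, FedAsync-style.
        while not self.inbox.empty():
            w = self.inbox.get()
            self.weight = (1 - alpha) * self.weight + alpha * w

def sync_round(nodes):
    # Synchronous intra-cluster step: all nodes average their models,
    # so every node leaves the round with the same global view.
    avg = sum(n.weight for n in nodes) / len(nodes)
    for n in nodes:
        n.weight = avg
```

The design choice this illustrates is that the expensive, high-latency edge links never participate in a blocking protocol, while the cheap intra-cluster links can afford a synchronous round that keeps all server replicas consistent.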