| With the development of information technology, data centers such as hospitals and banks produce more and more data. These rich data have become an important resource for deep learning. Deep learning generally requires that all data be gathered in one place. However, because of privacy protection and limited communication conditions, aggregating data is often difficult. To make use of these data, federated learning algorithms were introduced, which train machine learning models collaboratively across multiple client datasets located in different geographical locations. However, how to combine the models from the various clients into a global model with high generalization ability, especially when the client data distributions are inconsistent, remains an open problem. To improve the generalization ability of the global model, this paper first improves the traditional federated learning algorithm and proposes a composition-decomposition based federated learning (CD-FL) algorithm. In CD-FL, the global model on the server is composed of several sub-models with the same architecture. The global model is decomposed into its sub-models and sent to each client; each client randomly selects one sub-model and updates it with its local data. The updated models are uploaded to the server, clustered together with the original sub-models, and reassembled into a global model on the server. The final global model is obtained by iterating this update. Experimental results on the EMNIST, FASHION-MNIST, CIFAR-10 and TINY-IMAGENET datasets show that CD-FL has better generalization ability than existing representative methods and reduces the total communication cost. Then, to further improve the generalization performance of federated learning algorithms, this paper approaches the problem from the perspective of model generation and proposes a federated pre-training algorithm based on ensemble and aggregation (the Fed Pre algorithm). In the federated pre-training algorithm, the server stores multiple sub-models that form a model archive set. The model archive set is sent to each client; each client randomly selects a sub-model from it and updates that sub-model with its own data. The updated models are uploaded to the server, and the server aggregates the models received from the clients to obtain a new model archive set. In this way, the similarity among the models in the archive set is continuously reduced, and initial model parameters with high performance that are well suited to federated learning are finally obtained. Experimental results on the FASHION-MNIST and CIFAR-10 datasets show that the federated pre-training algorithm based on ensemble and aggregation can improve the generalization performance of existing representative methods. |
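
The composition-decomposition round described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the clustering method, the local training procedure, or the model type, so the sketch assumes k-means over parameter vectors (seeded with the current sub-models), a toy quadratic local objective, and models represented as plain lists of floats. All function names (`local_update`, `cluster_and_recompose`, `cdfl_round`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_update(model, data, lr=0.5, steps=5):
    """Toy local training (assumption): gradient descent on
    0.5 * mean ||w - x||^2, whose gradient is w - mean(x)."""
    w = list(model)
    mean = [sum(x[d] for x in data) / len(data) for d in range(len(w))]
    for _ in range(steps):
        w = [wi - lr * (wi - mi) for wi, mi in zip(w, mean)]
    return w


def cluster_and_recompose(candidates, centers, iters=10):
    """Assumed server step: k-means over parameter vectors, seeded with
    the current sub-models; the centroids become the new sub-models."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for c in candidates:
            nearest = min(range(len(centers)),
                          key=lambda k: sq_dist(c, centers[k]))
            groups[nearest].append(c)
        for k, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous center
                dim = len(g[0])
                centers[k] = [sum(v[d] for v in g) / len(g)
                              for d in range(dim)]
    return centers


def cdfl_round(sub_models, client_datasets, rng):
    """One round: decompose, random selection, local update, recompose."""
    uploads = []
    for data in client_datasets:
        chosen = rng.choice(sub_models)  # client picks one sub-model
        uploads.append(local_update(chosen, data))
    # Updated models are clustered together with the original sub-models.
    return cluster_and_recompose(uploads + sub_models, sub_models)


# Small demo with synthetic, non-identically distributed client data.
rng = random.Random(0)
sub_models = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(3)]
clients = [[[rng.gauss(c, 0.1), rng.gauss(-c, 0.1)] for _ in range(20)]
           for c in (-2.0, 0.0, 2.0, 2.0)]
for _ in range(3):
    sub_models = cdfl_round(sub_models, clients, rng)
```

The number of sub-models stays fixed across rounds because the clustering step always returns one centroid per existing sub-model. The Fed Pre loop described in the abstract has the same select, update, upload shape, with a different server-side aggregation step that the abstract does not detail.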