Distributed machine learning has been proposed to address the problem of data silos. Federated learning trains models between servers and clients without exchanging personal data, which remains local, so users' private data can be protected. However, private information may still be leaked to some extent by analyzing the parameters that clients train and upload, such as the weights of deep neural networks. To address this problem, differential privacy, with its lightweight advantages, is widely applied to enhance the privacy of federated learning. Nevertheless, several drawbacks remain: 1) Users' privacy requirements vary with occupation and geography. Most existing work assumes that privacy budgets are distributed equally, but this assumption is unrealistic given differing legislations, countries, and occupational contexts. 2) A uniform privacy budget level means that some clients waste large amounts of their privacy budget, which often degrades model accuracy. 3) Current client selection mechanisms for differential-privacy-based federated learning do not account for changes in users' data quality after noise is added. Moreover, they select clients with equal probability, which reduces training efficiency and cannot quantify the degree of privacy protection.

This paper proposes a solution for personalized privacy-preserving federated learning that addresses the above shortcomings as follows: (1) To meet users' personalized privacy requirements under federated learning while satisfying the differential privacy constraint, a two-stage federated learning algorithm with personalized differential privacy (PDP-FL) is proposed. Users can set their privacy preferences locally. PDP-FL guarantees the differential privacy constraint for each round of federated learning by adding a second round of fine-grained noise in the
central aggregation phase of federated learning. On the one hand, the personalized differential privacy algorithm adds only the noise appropriate to each user's privacy preference, reducing wasted privacy budget. On the other hand, the algorithm controls the amount of added noise by setting the privacy budget threshold that global differential privacy must satisfy. It can therefore quantify the degree of privacy protection while providing global protection for federated learning, achieving both local and central privacy protection while satisfying users' individual privacy needs and, for the first time, quantifying the global privacy protection strength. Experimental results show that, compared with other differentially private federated learning methods, PDP-FL improves classification accuracy in multiple scenarios while meeting the need for personalized privacy protection.

(2) To further mitigate the excessive noise added by some users and improve training accuracy, an unsupervised personalized federated learning client selection (PCS-FL) algorithm is proposed. The algorithm selects users with higher-value data from those participating in federated learning by computing the similarity of users' data. In the federated learning scenario it is difficult to use labels to measure the quality of users' non-independent and identically distributed (non-IID) data, and unsupervised learning handles such unlabeled user data; the selection therefore improves the model accuracy of federated learning. In addition, the unsupervised client selection mechanism requires only a single upload of client similarity parameters, so PCS-FL does not increase the communication overhead of federated learning. Experimental results show that the proposed PCS-FL algorithm improves the classification accuracy of the model in several scenarios.
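The two-stage noise addition described for PDP-FL can be sketched as follows. This is a minimal illustration only, assuming the standard Gaussian mechanism for calibrating noise to a privacy budget; the function names, the use of gradient-norm clipping, and the server-side "top-up" rule for reaching the global threshold are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def gaussian_sigma(epsilon, delta, clip_norm):
    # Standard Gaussian-mechanism calibration:
    # sigma = clip_norm * sqrt(2 ln(1.25/delta)) / epsilon.
    # Smaller epsilon (stronger privacy) yields larger sigma.
    return clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def local_perturb(update, epsilon_i, delta, clip_norm):
    # Stage 1 (client side): clip the update and add noise calibrated
    # to this client's *own* privacy preference epsilon_i.
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    sigma = gaussian_sigma(epsilon_i, delta, clip_norm)
    return clipped + np.random.normal(0.0, sigma, size=update.shape)

def central_aggregate(updates, epsilons, delta, clip_norm, epsilon_global):
    # Stage 2 (server side, illustrative): if the loosest local guarantee
    # (largest epsilon_i) does not meet the global budget threshold,
    # add extra noise so the aggregate satisfies epsilon_global.
    avg = np.mean(updates, axis=0)
    eps_max = max(epsilons)
    if eps_max > epsilon_global:
        sigma_have = gaussian_sigma(eps_max, delta, clip_norm)
        sigma_need = gaussian_sigma(epsilon_global, delta, clip_norm)
        extra = np.sqrt(max(sigma_need**2 - sigma_have**2, 0.0))
        avg = avg + np.random.normal(0.0, extra / len(updates), size=avg.shape)
    return avg
```

The design point the sketch captures is that each client pays only for its own preferred budget locally, and the server adds fine-grained noise centrally only when needed, so a client with a strict preference does not force everyone else to over-noise.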
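The unsupervised client selection in PCS-FL could be sketched as below. The scoring rule is an assumption for illustration: each client uploads a summary vector once, and the server ranks clients by cosine similarity of those vectors, keeping the top-k as the "higher data value" participants; the actual similarity measure and ranking used in the paper may differ.

```python
import numpy as np

def cosine_similarity_matrix(feats):
    # feats: (n_clients, d) matrix of per-client summary vectors,
    # uploaded a single time, so selection adds no per-round traffic.
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return normed @ normed.T

def select_clients(feats, k):
    # Illustrative unsupervised scoring: rank each client by its mean
    # cosine similarity to the other clients and keep the top-k.
    sim = cosine_similarity_matrix(feats)
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    scores = sim.mean(axis=1)
    return np.argsort(scores)[::-1][:k]
```

Because selection relies only on similarity between unlabeled summary vectors, it avoids needing labels to judge data quality on non-IID client data, which is the motivation stated above.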