| With the continuous development of the information and digital society, the world is constantly producing large amounts of data, and extracting valuable information from these data can bring substantial economic benefits to society. At the same time, the process of extracting value from data easily gives rise to data security problems. Because of concerns about data security, data owners are reluctant to share their data with others, which leads to data silos and greatly limits the development of machine learning. In response to this problem, Google proposed the concept of federated learning in 2016. Federated learning is a machine learning approach in which multiple users jointly train a model while each user's data stays local. It aims to protect user data privacy and to solve the problem of data silos. However, in-depth research has shown that data security problems remain in federated learning. For example, an attacker can infer the original data of federated learning participants from the gradients generated during training. Attackers can also attack the model through data poisoning, thereby affecting the training of the model. To address these problems, this thesis proposes a participant evaluation algorithm and a holistic aggregation algorithm for horizontal federated learning, and uses them to build a secure horizontal federated learning framework. The main work is as follows: (1) A participant evaluation algorithm is proposed. To address poisoning attacks and malicious users in horizontal federated learning, the participant evaluation algorithm compares the results of cluster analysis, data set trend analysis, and data label correlation analysis to identify irregular data sets in model training, and thereby protects the model from poisoning attacks. (2) A holistic aggregation algorithm is proposed. To address gradient leakage during the training of the federated learning model, the holistic aggregation algorithm ensures that gradients remain in ciphertext after leaving the participant's local environment, while the server performs the aggregation directly on the ciphertexts. In this process, a trusted third party is introduced to generate the keys, which ensures that the server cannot decrypt the gradients. The holistic aggregation algorithm thus prevents gradient leakage and ensures the security of the gradients. (3) A secure horizontal federated learning framework based on cluster analysis and homomorphic encryption is constructed. The framework consists of the participant evaluation algorithm and the holistic aggregation algorithm, and can effectively defend against poisoning attacks and prevent data leakage, thereby ensuring data security. |
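
The ciphertext-side aggregation described in item (2) can be illustrated with a small sketch. The abstract does not name the homomorphic scheme used, so the code below assumes a textbook Paillier cryptosystem as a stand-in. The function names (`keygen`, `encrypt`, `aggregate`, `decrypt`), the fixed-point `SCALE`, and the assumption that participants hold the private key issued by the trusted third party are all hypothetical and chosen only for illustration. The primes are deliberately tiny, so this is not a secure implementation.

```python
import math
import random

SCALE = 10**6  # assumed fixed-point scale for encoding float gradients


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Trusted third party: generate a toy Paillier key pair.

    The two Mersenne primes are far too small for real use.
    """
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, value):
    """Participant: encrypt one gradient component under public key n."""
    m = round(value * SCALE) % n  # negative values wrap around modulo n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return (1 + m * n) * pow(r, n, n2) % n2


def aggregate(n, ciphertexts):
    """Server: multiply ciphertexts, which adds the hidden plaintexts."""
    n2 = n * n
    total = 1
    for c in ciphertexts:
        total = total * c % n2
    return total


def decrypt(n, private_key, c):
    """Key holder: recover the aggregated gradient component."""
    lam, mu = private_key
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    if m > n // 2:  # map the upper half of the range back to negatives
        m -= n
    return m / SCALE


# Three participants each contribute one gradient component.
n, private_key = keygen()
gradients = [0.25, -0.5, 1.125]
encrypted = [encrypt(n, g) for g in gradients]
total = decrypt(n, private_key, aggregate(n, encrypted))
```

In this sketch the server only ever calls `aggregate`, which needs the public modulus `n` and the ciphertexts, so it can sum gradients without being able to read any of them.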