| Artificial intelligence models improve their performance by training on large datasets. However, different data have different value. For enterprises, developing malicious software (malware) identification models offers guidance for the whole industry. At the same time, as technology develops, citizens are increasingly concerned about data security and privacy. As a result, malware identification is limited to some extent by data models and existing knowledge, which leads to the problem of "isolated data islands". Traditional malware detection models usually rely on methods such as text or behavioral feature analysis, but they suffer from low efficiency, high false positive rates, and security risks. This dissertation studies the performance of malware identification together with data privacy protection and security. It first proposes a malware identification algorithm based on federated learning, which identifies malware by combining federated learning with malware visualization technology. Each enterprise acts as a client and shares only model parameters, and training uses the FedAvg algorithm of horizontal federated learning. Compared with centralized training, this approach offers better privacy and security. In practical applications of federated-learning-based malware detection, however, a deployment may involve thousands of users, some of them malicious, and the working mechanism of federated learning leaves such applications vulnerable to attack. Existing backdoor attack schemes need multiple rounds of backdoor training before the backdoor model converges, which adds computational overhead. Moreover, their triggers are easy to detect and reconstruct, which reduces the effectiveness of the attack. This dissertation therefore also proposes a distributed backdoor attack scheme based on generative adversarial networks (GANs). Compared with traditional poisoning attacks, this backdoor attack is more concealed and more flexible, and it applies both to current research models and to federated learning in general. The method selects data with a designated label as the target data, applies the image features of the target data to the original data, and guides the generated data to resemble the target data in terms of features, so that the model classifies the generated data under the label of the target data. The method is highly concealed and also improves the success rate and convergence speed of backdoor attacks, which exposes potential security risks in federated-learning-based malware detection. In conclusion, this research makes notable progress in malware identification performance, data privacy protection, and security, and provides new ideas for more secure and efficient malware detection. |
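
The parameter-sharing step described in the abstract can be illustrated with a minimal sketch of FedAvg-style aggregation. This is not the dissertation's implementation: the function name `fedavg`, the flat-list representation of model parameters, and the example client sizes are all assumptions made for illustration. It shows only the server-side step, in which each client's parameters are weighted by its local sample count, so that no raw data leaves the client.

```python
from typing import List, Sequence


def fedavg(client_params: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Aggregate client parameter vectors by a sample-count-weighted average.

    client_params: one flat parameter vector per client (hypothetical layout).
    client_sizes:  number of local training samples held by each client.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must send vectors of the same length")

    aggregated = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        share = size / total  # this client's weight in the global model
        for i, value in enumerate(params):
            aggregated[i] += share * value
    return aggregated


if __name__ == "__main__":
    # Two hypothetical enterprises: one with 100 samples, one with 300.
    global_params = fedavg([[1.0, 2.0], [3.0, 4.0]], [100, 300])
    print(global_params)
```

In a full training loop, the server would send the aggregated vector back to the clients, each client would run a few local training steps on its own data, and the cycle would repeat. Weighting by sample count means clients with more data have more influence on the global model, which is also one reason a malicious client's contribution can matter.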