Research On Malicious Software Identification And Security Based On Federated Learning

Posted on:2024-03-21

Degree:Master

Type:Thesis

Country:China

Candidate:L Han

Full Text:PDF

GTID:2568306941492884

Subject:Software engineering

Abstract/Summary:


Artificial intelligence models rely on large amounts of data to improve their performance. However, not all data are equally valuable. For enterprises, the development of malicious software (malware) identification models has guiding value for the industry. As technology develops, however, citizens are increasingly concerned about data security and privacy. As a result, malware identification technology is to some extent limited by data models and existing knowledge, which leads to the problem of "isolated data islands". Traditional malware detection models usually rely on methods such as text or behavioral feature analysis, but they suffer from low efficiency, high false positive rates, and security risks. This dissertation studies the performance of malware identification, data privacy protection, and security in depth.

First, this dissertation proposes a malware identification algorithm based on federated learning, which identifies malware by combining federated learning with malware visualization technology. Each enterprise acts as a client and shares only model parameters, and training uses the FedAvg algorithm of horizontal federated learning. Compared with centralized training methods, this approach offers better privacy and security.

Second, in practical applications of malware detection based on federated learning, the system may have thousands of users, some of whom are malicious. Because of the working mechanism of federated learning, such applications are vulnerable to attacks. Current backdoor attack schemes require multiple rounds of backdoor training before the backdoor model converges, which adds computational overhead. Moreover, their triggers are easy to detect and reconstruct, which reduces the effectiveness of the attack. This dissertation proposes a distributed backdoor attack scheme based on generative adversarial networks (GANs). Compared with traditional poisoning attacks, this backdoor attack is more concealed and more flexible, and it applies both to the models studied here and to federated learning in general. The method selects data with a designated label as the target data, applies the image features of the target data to the original data, and guides the generated data to resemble the target data in terms of features, so that the model classifies the generated data under the label of the target data. The method is highly concealed and also improves the success rate and convergence speed of backdoor attacks, which exposes potential security risks in applications of malware detection based on federated learning.

In conclusion, this research has made important breakthroughs in the performance of malware identification, data privacy protection, and security, and it provides new ideas for achieving more secure and efficient malware detection.
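
The training setup described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation. The abstract does not say how the malware visualization is done, so `bytes_to_grayscale` assumes the common approach of mapping each byte of a binary to one grayscale pixel. `fedavg` shows the standard FedAvg aggregation step, in which the server averages client parameters weighted by each client's local sample count. All function names, the `width` parameter, and the parameter layout (a dictionary of flat float lists) are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical parameter layout: layer name -> flat list of floats.
Weights = Dict[str, List[float]]


def bytes_to_grayscale(data: bytes, width: int = 16) -> List[List[int]]:
    """Assumed visualization step: one byte (0-255) becomes one pixel.

    The last row is padded with zeros so that every row has `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows: List[List[int]] = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


def fedavg(client_weights: Sequence[Weights],
           client_sizes: Sequence[int]) -> Weights:
    """Average client parameters, weighted by local sample counts.

    Only parameters are exchanged here; the raw samples stay on the
    clients, which matches the sharing model described in the abstract.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


if __name__ == "__main__":
    # Two hypothetical enterprises with 1 and 3 local samples.
    global_model = fedavg(
        [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}],
        [1, 3],
    )
    print(global_model)
    print(bytes_to_grayscale(b"\x00\x01\x02", width=2))
```

Because the server in this scheme sees only submitted parameters, it cannot inspect a client's training data directly, which is the property that makes the backdoor attacks discussed in the abstract hard to spot.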

Keywords/Search Tags:

Federated Learning, Malware Identification, Generative Adversarial Networks, Backdoor Attacks


Related items

1	Research On Evasion Attack Methods For Malware Based On Generative Adversarial Networks
2	Research On Defense Methods Against Poisoning Attacks In Federated Learning Based On Model Parameter Measurement
3	Backdoor Attacks And Defenses On Deep Neural Networks
4	Research On Attack Techniques And Defense Strategies For Federated Learning
5	Federated Traffic Synthesizing And Classification Using Generative Adversarial Networks
6	Research On Adversarial Sample Generation Method For PE Malware Detection
7	Research On Attack And Defense Methods Based On Federated Learning Network In IoT Scenarios
8	Federated Generative Adversarial Optimization Algorithm For Non-IID Data
9	Research On Robust Mechanism For Malware Classification Models
10	Research On Federated Learning Methods And Applications For Heterogeneous Data Sources