Multi-Agent Reinforcement Learning (MARL) is a machine learning approach that uses reinforcement learning techniques to govern the interaction of multiple agents. MARL offers flexibility, robustness, and scalability, and has been widely applied in areas such as autonomous driving, cooperative control, and multi-agent games. However, current research has shown that reinforcement learning systems are vulnerable to malicious attacks, such as adversarial example attacks, backdoor attacks, and data poisoning attacks. As an important branch of reinforcement learning, MARL carries the same risk of malicious attack. Unlike the single-agent case, however, an attacker in MARL must account for the interaction and mutual influence among multiple agents, which greatly complicates the design and execution of attack strategies. Studying attack strategies helps expose the loopholes and vulnerabilities of MARL systems and is of great significance for improving their security. This paper adopts a fully decentralized MARL framework and proposes a covert backdoor attack for the training phase and an efficient adversarial example attack for the inference phase of the agent model:

(1) A covert backdoor attack based on reward poisoning. A study of existing backdoor attack methods shows that naive backdoor attacks use conspicuous triggers, which makes them easy to discover and detect. In a multi-agent competitive environment, this method instead designs a trigger drawn from the environment's natural state distribution; because the trigger is a normal state of the environment, it is difficult for humans to detect. Furthermore, since an agent's action output is influenced by its opponent's state in a competitive environment, the victim agent's opponent can also participate in forming the trigger, improving both the concealment and the success rate of the backdoor attack.

(2) An efficient adversarial example attack based on AdvGAN. A study of existing adversarial example attacks in reinforcement learning shows that there is still considerable room to optimize both the attack strategy and adversarial example generation. In multi-agent cooperative and competitive environments, this method combines a reinforcement learning algorithm with AdvGAN to optimize the attack strategy and generate adversarial examples, improving attack efficiency and achieving end-to-end fast, stable, and natural adversarial example attacks on multi-agent environments with a minimized number of attacks. Finally, this paper demonstrates the effectiveness and efficiency of the AdvGAN-based adversarial example attack through experiments in multi-agent cooperative and competitive environments.
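The reward-poisoning idea behind the backdoor attack in (1) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`poison_reward`, `is_trigger`), the choice of reading the trigger from a leading slice of the state vector (standing in for the opponent-related part of the observation), and the reward values are all assumptions made here for clarity.

```python
import numpy as np

def is_trigger(state, trigger_pattern, tol=1e-6):
    """Hypothetical trigger check: the trigger is a pattern that occurs
    naturally in the environment's state distribution (here, assumed to
    occupy the leading entries of the state, e.g. opponent-related features),
    so a human observer sees nothing unusual."""
    return np.allclose(state[: len(trigger_pattern)], trigger_pattern, atol=tol)

def poison_reward(state, action, reward, trigger_pattern, target_action, bonus=1.0):
    """Reward-poisoning step during training: when the trigger is present,
    reward the attacker's target action and penalize all others, steering
    the victim policy toward the backdoored behavior. Clean experience
    passes through unmodified, keeping the attack covert."""
    if is_trigger(state, trigger_pattern):
        return bonus if action == target_action else -bonus
    return reward

# Illustrative usage with a 4-dimensional state and a 2-dimensional trigger.
pattern = np.array([1.0, 2.0])
s_trig = np.array([1.0, 2.0, 0.5, 0.0])   # trigger present
s_clean = np.array([0.0, 0.0, 0.5, 0.0])  # trigger absent
r_target = poison_reward(s_trig, 3, 0.2, pattern, target_action=3)   # -> bonus
r_wrong = poison_reward(s_trig, 1, 0.2, pattern, target_action=3)    # -> -bonus
r_clean = poison_reward(s_clean, 1, 0.2, pattern, target_action=3)   # -> unchanged
```

In the paper's competitive setting the trigger is partly formed by the opponent's behavior, which this sketch abstracts away by treating the opponent-related features as a fixed slice of the state.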
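The generator side of the AdvGAN-based attack in (2) can be sketched as a network that maps an observation to a bounded perturbation. The sketch below uses random, untrained weights purely to show the data flow and the perturbation budget; in the actual method the generator would be trained adversarially against the victim policy (GAN loss plus an attack objective), and the class name `AdvGANGenerator` and the L-infinity bound `epsilon` are assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(0)

class AdvGANGenerator:
    """Minimal AdvGAN-style generator sketch: a one-hidden-layer MLP that
    maps an observation to a perturbation bounded in L-infinity norm by
    epsilon, keeping the adversarial observation close to the original
    (the 'natural'-looking attack). Weights are random placeholders here."""

    def __init__(self, obs_dim, hidden=16, epsilon=0.1):
        self.W1 = rng.normal(scale=0.5, size=(obs_dim, hidden))
        self.W2 = rng.normal(scale=0.5, size=(hidden, obs_dim))
        self.epsilon = epsilon  # perturbation budget

    def perturb(self, obs):
        h = np.tanh(obs @ self.W1)
        # tanh output lies in [-1, 1], so |delta| <= epsilon element-wise.
        delta = self.epsilon * np.tanh(h @ self.W2)
        return obs + delta

# Illustrative usage: perturb a single 4-dimensional observation.
gen = AdvGANGenerator(obs_dim=4)
obs = rng.normal(size=4)
adv_obs = gen.perturb(obs)
```

Because the generator produces a perturbation in a single forward pass, attacks at inference time are fast and end-to-end, which is the efficiency property the abstract claims for the AdvGAN-based method.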