With the rapid development of artificial intelligence (AI) and UAV swarm technology, multi-UAV confrontation has become a hot topic in the military field. Multi-UAV confrontation is an extension of UAV swarm technology in which intelligent algorithms control a UAV swarm in aerial combat. Based on multi-agent deep reinforcement learning, this paper applies a variety of techniques to the problems that arise in multi-UAV confrontation tasks and builds a reinforcement learning model for multi-UAV confrontation. The UAVs are trained with reinforcement learning methods and evaluated in a multi-UAV confrontation simulation environment. The main contributions of this paper are as follows:

1. We formally describe the multi-UAV confrontation task and design the motion model of the UAVs. We build a simulation platform for multi-UAV confrontation and define a target allocation mechanism for the UAVs. We also design control algorithms for both sides: our UAVs adopt the MADDPG algorithm, while the opponents' UAVs follow a rule-based method. Finally, we train our UAVs with MADDPG so that they can effectively intercept the opponents (see the first two sketches after this list).

2. To address the non-stationarity caused by changes in the opponents' policies, we improve the Actor-Critic framework of MADDPG and propose an additional opponent characteristics method for multi-UAV confrontation, which introduces additional opponent features to model the opponents' policies and indirectly predict their behavior. UAVs trained with this method can anticipate changes in the opponents' policies and make decisions in advance, which reduces fluctuations during policy learning and makes the reinforcement learning process more stable (see the third sketch below).

3. In multi-UAV confrontation, the input dimension of the centralized Q network is large, which hinders the learning of cooperative UAV policies and causes reinforcement learning to fall into local optima. We propose a group-based Actor-Critic method for multi-UAV confrontation, which dynamically groups our UAVs and reduces the input dimension of the network. We also introduce a double Q network to model the cooperative policies of our UAVs, so that they converge quickly to optimal cooperative policies and learn more advanced group cooperative behavior during the confrontation (see the fourth sketch below).
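To make the UAV motion model in contribution 1 concrete, below is a minimal 2-D kinematic sketch. The state layout (position, heading, speed) and the bounded controls (turn rate, acceleration) are illustrative assumptions, not the paper's actual model.

```python
# A minimal 2-D kinematic UAV motion model sketch. The state layout
# (x, y, heading, speed) and the control inputs (turn rate, acceleration)
# are assumptions for illustration; the paper's motion model may differ.
import math

def step_uav(state, control, dt=0.1, v_max=30.0, omega_max=math.pi / 6):
    """Advance one UAV by one time step under bounded turn rate and speed."""
    x, y, heading, v = state
    omega, accel = control
    omega = max(-omega_max, min(omega_max, omega))   # clamp turn rate
    v = max(0.0, min(v_max, v + accel * dt))         # clamp speed
    heading = (heading + omega * dt) % (2 * math.pi)
    x += v * math.cos(heading) * dt
    y += v * math.sin(heading) * dt
    return (x, y, heading, v)
```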
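The MADDPG training referenced in contribution 1 pairs decentralized actors with centralized critics that see all agents' observations and actions. The following PyTorch-style sketch shows one update step for a single agent; all network sizes, names (`Actor`, `Critic`, `maddpg_update`), and hyperparameters are assumptions for illustration, not the paper's implementation.

```python
# Minimal MADDPG update sketch (PyTorch). Sizes and names are illustrative.
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM = 3, 12, 2   # assumed dimensions
GAMMA = 0.99

class Actor(nn.Module):
    """Decentralized policy: maps one UAV's observation to its action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACT_DIM), nn.Tanh())
    def forward(self, obs):
        return self.net(obs)

class Critic(nn.Module):
    """Centralized Q: sees the observations and actions of all UAVs."""
    def __init__(self):
        super().__init__()
        in_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))
    def forward(self, all_obs, all_act):
        return self.net(torch.cat([all_obs, all_act], dim=-1))

def maddpg_update(i, actors, critics, targets, batch, opt_a, opt_c):
    """One gradient step for agent i on a sampled minibatch."""
    obs, act, rew, next_obs, done = batch  # obs: (B, N_AGENTS * OBS_DIM), etc.
    # Critic: regress Q toward the one-step TD target using target networks.
    with torch.no_grad():
        next_act = torch.cat(
            [targets["actor"][j](next_obs[:, j*OBS_DIM:(j+1)*OBS_DIM])
             for j in range(N_AGENTS)], dim=-1)
        y = rew + GAMMA * (1 - done) * targets["critic"][i](next_obs, next_act)
    critic_loss = nn.functional.mse_loss(critics[i](obs, act), y)
    opt_c[i].zero_grad(); critic_loss.backward(); opt_c[i].step()
    # Actor: ascend the centralized Q with agent i's action recomputed.
    act_i = actors[i](obs[:, i*OBS_DIM:(i+1)*OBS_DIM])
    new_act = act.clone()
    new_act[:, i*ACT_DIM:(i+1)*ACT_DIM] = act_i
    actor_loss = -critics[i](obs, new_act).mean()
    opt_a[i].zero_grad(); actor_loss.backward(); opt_a[i].step()
```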
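One plausible reading of the additional opponent characteristics method in contribution 2 is an auxiliary network that predicts observed opponent actions and whose hidden features are appended to the critic input. The sketch below follows that assumption; the actual architecture, loss, and feature injection point in the paper may differ.

```python
# Sketch of the "additional opponent characteristics" idea: an auxiliary
# network predicts opponent actions from observations, and its hidden
# features are fed to the critic as extra input. Sizes and the supervised
# loss are assumptions for illustration.
import torch
import torch.nn as nn

OBS_DIM, OPP_ACT_DIM, FEAT_DIM = 12, 2, 16  # assumed dimensions

class OpponentModel(nn.Module):
    """Infers opponent-policy features from our UAV's observation."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                     nn.Linear(64, FEAT_DIM), nn.ReLU())
        self.head = nn.Linear(FEAT_DIM, OPP_ACT_DIM)  # predicted opponent action

    def forward(self, obs):
        feat = self.encoder(obs)
        return feat, self.head(feat)

def opponent_model_loss(model, obs, observed_opp_act):
    """Supervised auxiliary loss: match predicted to observed opponent actions."""
    _, pred = model(obs)
    return nn.functional.mse_loss(pred, observed_opp_act)

# The critic then receives [obs, actions, opponent features], so that value
# estimates condition on the inferred opponent policy; this is what lets the
# learner anticipate opponent policy changes and reduces non-stationarity.
```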
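For contribution 3, the sketch below illustrates one way the dynamic grouping and the double Q network could fit together. The nearest-target grouping rule and the min-of-two-critics estimate (as in clipped double Q-learning) are assumptions chosen to show the structure, not the paper's confirmed design.

```python
# Sketch of a group-based critic with a double Q network. The grouping
# criterion (assign each UAV to its nearest target) and the min-of-two-
# critics estimate are illustrative assumptions.
import torch
import torch.nn as nn

GROUP_SIZE, OBS_DIM, ACT_DIM = 2, 12, 2  # assumed dimensions

def dynamic_groups(uav_pos, target_pos):
    """Group UAVs by their nearest target; returns a target index per UAV."""
    dists = torch.cdist(uav_pos, target_pos)      # (n_uav, n_target)
    return dists.argmin(dim=1)                    # group id for each UAV

class GroupCritic(nn.Module):
    """Q network over one group only, shrinking the critic's input dimension."""
    def __init__(self):
        super().__init__()
        in_dim = GROUP_SIZE * (OBS_DIM + ACT_DIM)
        self.q1 = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                nn.Linear(128, 1))
        self.q2 = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                nn.Linear(128, 1))

    def forward(self, group_obs, group_act):
        x = torch.cat([group_obs, group_act], dim=-1)
        # Double Q: take the smaller estimate to curb value overestimation.
        return torch.min(self.q1(x), self.q2(x))
```

Restricting each critic to one group keeps the input dimension fixed as the swarm grows, which is the stated motivation for grouping: a smaller, stabler critic input makes it easier to learn cooperative policies without falling into local optima.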