
Research on Generation and Optimization Methods for Multi-UAV Within-Visual-Range Air Combat Confrontation Game Strategies

Posted on: 2022-01-15   Degree: Master   Type: Thesis
Country: China   Candidate: T R Jiang   Full Text: PDF
GTID: 2532307169980009   Subject: Control Science and Engineering
Abstract/Summary:
With the development of unmanned aerial vehicle (UAV) technology and artificial intelligence, UAVs will be applied pervasively on future battlefields, and coordinated air combat between multiple UAVs will become a critical combat style of future intelligent warfare. Under strong electromagnetic countermeasures and without the support of a combat system, a UAV can only rely on the perception of its own sensors to conduct air combat games beyond visual range. In such a highly confrontational air combat environment with complex decisions, UAVs require greater autonomous decision-making capability. However, multi-UAV air combat decision-making is difficult to test in real-world experiments, so the confrontation strategies must be learned and optimized in a simulation environment in order to provide decision-making methods for multi-UAV air combat confrontation and to improve UAVs' decision-making and coordination capabilities. To this end, this thesis proposes two classes of multi-UAV confrontation strategy generation and optimization methods, one based on human empirical knowledge and the other on data-driven learning. The main research contents are as follows:

(1) A strategy generation method for multi-UAV air combat based on a behavior tree with logical constraints is proposed. To address the complexity of decision-making in multi-UAV air combat, we first study a situation analysis method for multi-UAV coordinated combat, a hierarchical decision modeling method based on behavior trees, and a logical-constraint rule library. We then establish the interaction between the behavior tree and the logical constraints, and finally propose a hierarchical decision-making method that combines the behavior tree with logical constraints. In the experiments, decision-making models built with this method are pitted against a pre-planned blue agent in confrontation scenarios of different scales. Results such as the victory rate and the average confrontation score show that the method can be applied effectively to multi-UAV beyond-visual-range confrontation. In addition, the method was applied to a wargaming agent competition; after the 80-point competition, we obtained the runner-up result, further verifying its effectiveness. Although this method can reduce the search over the strategy space during confrontation and yields effective strategies, the structure and rules of the behavior tree are hard to extend as the confrontation scale grows, the generated strategies generalize poorly to new scenarios, and the method does not adapt to changes in the enemy's numbers and strategies during the confrontation.

(2) A multi-agent reinforcement learning method based on a self-attention mechanism is proposed to learn multi-UAV confrontation strategies. To reduce the dependence on prior knowledge, this thesis uses multi-agent reinforcement learning to learn adversarial strategies. During a confrontation, the constantly changing numbers and strategies of both sides make the learning process unstable, and the learned strategy cannot adapt to the dynamically changing environment. This thesis therefore introduces a self-attention mechanism into the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which allows each UAV to automatically select the information relevant to it during the confrontation, dynamically adapt to changes in confrontation scale and opponent strategy during learning, and achieve better learning and collaboration. In the experiments, a typical confrontation scenario was constructed in a simulation environment with a simplified confrontation model, and the method was compared with other multi-agent reinforcement learning algorithms. The confrontation results show that our method is more stable than the alternatives and effectively improves the collaborative confrontation ability of multiple UAVs. However, it remains difficult to learn effective strategies in larger-scale confrontations, which limits how far the confrontation scale can be expanded.
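The abstract does not give implementation details for the attention module, so the following is only a minimal sketch of how a self-attention layer could pool a variable number of per-entity observations inside a MADDPG-style critic, so that the network input no longer depends on the number of friendly and enemy UAVs. All names (EntityAttentionCritic, entity_dim, and so on) are illustrative assumptions rather than the thesis's actual code, and for brevity only the evaluated agent's action is concatenated, whereas a full centralized MADDPG critic would condition on all agents' actions.

```python
import torch
import torch.nn as nn

class EntityAttentionCritic(nn.Module):
    """Hypothetical MADDPG-style critic: self-attention pools a variable
    number of per-entity observations into a fixed-size feature vector."""

    def __init__(self, entity_dim=8, act_dim=4, embed_dim=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(entity_dim, embed_dim)          # per-entity encoder
        self.attn = nn.MultiheadAttention(embed_dim, n_heads,
                                          batch_first=True)    # self-attention over entities
        self.q_head = nn.Sequential(                           # Q(s, a) head
            nn.Linear(embed_dim + act_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, entities, action, mask=None):
        # entities: (batch, n_entities, entity_dim) -- friendly and enemy UAV states
        # action:   (batch, act_dim)                -- the evaluated agent's action
        # mask:     (batch, n_entities) True where an entity slot is padding
        x = self.embed(entities)
        x, _ = self.attn(x, x, x, key_padding_mask=mask)       # attend over all entities
        if mask is not None:
            x = x.masked_fill(mask.unsqueeze(-1), 0.0)
            pooled = x.sum(1) / (~mask).sum(1, keepdim=True).clamp(min=1)
        else:
            pooled = x.mean(1)                                  # average-pool attended features
        return self.q_head(torch.cat([pooled, action], dim=-1))

# Example: the same critic handles 3-vs-3 and 5-vs-5 observations.
critic = EntityAttentionCritic()
obs_3v3 = torch.randn(2, 6, 8)
obs_5v5 = torch.randn(2, 10, 8)
act = torch.randn(2, 4)
print(critic(obs_3v3, act).shape, critic(obs_5v5, act).shape)  # torch.Size([2, 1]) twice
```

Because the attention layer operates over a set of entity embeddings rather than a flat concatenated state, the same parameters can be reused as the number of UAVs on either side changes, which is the property the thesis relies on to handle varying confrontation scales.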
(3) A multi-UAV strategy generation method based on population evolution learning is proposed to expand the scale of multi-UAV confrontation while keeping the learning process stable. To address the difficulty of learning effective strategies once the confrontation scale is expanded, this thesis uses curriculum learning to increase the confrontation scale gradually and uses an evolutionary algorithm to select the most adaptable individuals from multiple populations when moving to a larger scale. This enables the UAVs to learn effective strategies after the scale-up and multiplies the achievable scale of multi-UAV confrontation. Comparative experiments with different methods show that, after the same number of training episodes, the strategies learned by this method exhibit stronger confrontation ability at the expanded scale.
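As a rough illustration of the scale-expansion idea in (3), the sketch below stages the confrontation scale with a curriculum and, at each stage, keeps the individuals from several populations that adapt best to the larger scale. The curriculum schedule, the fitness measure, and the helper names (evaluate, mutate, evolve_populations) are assumptions made for illustration, not the thesis's actual procedure; in practice evaluate would run simulated confrontations and mutate would perturb or further train policy parameters.

```python
import random

# Hypothetical curriculum: confrontation scale grows stage by stage (e.g. 2v2 -> 4v4 -> 8v8).
CURRICULUM = [2, 4, 8]

def evaluate(individual, scale):
    """Placeholder fitness: would run self-play episodes at scale-vs-scale
    and return an average score / win rate for this policy individual."""
    return random.random()  # stand-in for simulated confrontation results

def mutate(individual):
    """Placeholder variation operator on a policy's parameters."""
    return individual  # stand-in: perturb network weights / continue RL training

def evolve_populations(populations, keep=4, generations=10):
    """Curriculum + evolutionary selection: at each scale, vary every
    population, then carry only the best-adapted individuals to the next scale."""
    for scale in CURRICULUM:
        for _ in range(generations):
            for pop in populations:
                pop.extend(mutate(ind) for ind in list(pop))   # produce offspring
                pop.sort(key=lambda ind: evaluate(ind, scale), reverse=True)
                del pop[keep:]                                 # survival of the fittest
        # merge populations and keep individuals that adapt best to the larger scale
        survivors = sorted((ind for pop in populations for ind in pop),
                           key=lambda ind: evaluate(ind, scale), reverse=True)[:keep]
        populations = [list(survivors) for _ in populations]   # reseed for next stage
    return populations

if __name__ == "__main__":
    initial = [[f"policy_{p}_{i}" for i in range(4)] for p in range(3)]
    final = evolve_populations(initial)
    print(final[0])
```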
Keywords/Search Tags: Multi-UAVs, Air combat confrontation, Strategy generation, Multi-agent reinforcement learning, Curriculum learning, Behavior tree