Intelligent decision-making in air combat will greatly change the form of confrontation in future wars. Researchers at home and abroad have conducted extensive research on related technologies such as path planning, cooperative formation decision-making, and air combat simulation systems. With the rapid development of aviation technology, large numbers of low-cost heterogeneous UAVs can be deployed in air combat, so distributed collaborative intelligent decision-making for heterogeneous swarms has become an important research direction. Traditional work on intelligent air combat decision-making mostly concentrates on complex decision-making for a single vehicle or on path planning for a very small group; there has been little research on modern air combat confrontation with a large-scale, multi-dimensional action space.

This thesis focuses on tactical-level strategy generation for air combat confrontation with a large-scale, multi-dimensional action space. Based on multi-agent reinforcement learning, the thesis builds a hybrid air combat multi-agent system that can quickly generate collaborative strategies for heterogeneous swarms and achieves strong performance against specific opponents. The research is intended to promote more automated and intelligent air combat decision-making technology.

The system design involves three aspects. First, to address the scarcity and high acquisition cost of air combat data, a simulation module for heterogeneous air combat confrontation was designed and implemented; it provides key physical models such as aircraft motion and electronic perception. The simulator can quickly generate confrontation data and supports the training and verification of air combat strategies. Second, a hybrid decision-support framework was designed that can load and run multiple neural-network models and multiple rule-based models simultaneously on a single unit, adapting to the allocation and management of multiple decision models in practical air combat situations. State-machine and utility models are provided for the design and implementation of rule-based models, which accelerates the implementation of adversarial policy models. Finally, a training module based on multi-agent reinforcement learning was designed and implemented. Targeting the characteristics of air combat confrontation, several algorithmic optimizations were designed, including a multi-head neural network, a frame-skipping mechanism, a low-frequency decision-making mechanism, and an adaptive model-iteration mechanism, which improve the policy-generation quality and efficiency of the training module.

In the experiment section, a typical fourteen-versus-five asymmetric air combat confrontation scenario was designed, featuring heterogeneous units, distributed control, partial observability, and a multi-dimensional action space. Based on the methodology of the thesis, the air combat multi-agent system was implemented. First, a blue defense controller was designed and implemented based on built-in rule modules. Then the multi-agent training module was used to optimize the red strategy in the simulation module using collected training data. After training, a distributed cooperative confrontation policy for the fourteen red units was generated that defeats the rule-based blue team with a probability of more than 90%, accomplishing the generation of effective tactical strategies. The thesis then designed an alternating confrontation (self-play) training mechanism to improve both the blue and red policies, generating more diverse and robust policies and constructing a more complex and dynamic air combat multi-agent system. Finally, by analyzing the tactics and strategies that emerged during the adversarial process, the characteristics and effectiveness of the collaborative adversarial strategies were summarized and qualitatively analyzed.
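The frame-skipping and low-frequency decision-making mechanisms mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the `decision_interval` parameter, the toy environment, and the random policy are all invented names for demonstration. The idea is simply that the simulator advances every step while the (expensive) policy is queried only at a lower frequency, with the last action repeated in between.

```python
import random

def run_episode(env_step, policy, horizon, decision_interval):
    """Query the policy only every `decision_interval` simulation steps
    and repeat the last chosen action in between (frame skipping).
    Returns the number of policy queries, to show the saving."""
    obs = None
    action = None
    queries = 0
    for t in range(horizon):
        if t % decision_interval == 0:   # low-frequency decision point
            action = policy(obs)
            queries += 1
        obs = env_step(action)           # simulator advances every step
    return queries

# Toy stand-ins for the simulator and the decision model.
def env_step(action):
    return random.random()

def policy(obs):
    return random.choice(["turn_left", "turn_right", "hold"])

# With a 100-step horizon and a decision every 5 steps,
# the policy is queried only 20 times instead of 100.
print(run_episode(env_step, policy, horizon=100, decision_interval=5))  # → 20
```

Besides cutting inference cost, repeating an action over several simulation ticks shortens the effective decision horizon seen by the learner, which is one common motivation for such mechanisms in reinforcement-learning training loops.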
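A state-machine rule model of the kind used for the built-in blue controller can be sketched as below. This is a hypothetical example: the states ("patrol", "intercept", "evade"), the thresholds, and the action names are illustrative assumptions, not the thesis's actual rule set. It shows the general pattern of evaluating transition rules each decision tick and mapping the current state to an action command.

```python
# Minimal finite-state-machine rule model (illustrative sketch).
class RuleFSM:
    def __init__(self):
        self.state = "patrol"

    def step(self, enemy_distance, own_health):
        # Transition rules, evaluated once per decision tick.
        if own_health < 0.3:
            self.state = "evade"
        elif enemy_distance < 50.0:
            self.state = "intercept"
        else:
            self.state = "patrol"
        # Map the current state to an action command.
        return {"patrol": "hold_course",
                "intercept": "pursue_target",
                "evade": "break_away"}[self.state]

fsm = RuleFSM()
print(fsm.step(enemy_distance=120.0, own_health=1.0))  # → hold_course
print(fsm.step(enemy_distance=40.0, own_health=0.8))   # → pursue_target
print(fsm.step(enemy_distance=40.0, own_health=0.2))   # → break_away
```

Keeping the transition logic separate from the state-to-action mapping makes such rule models easy to extend with new states, which is presumably why the framework offers the state machine as a reusable template for rule-based policies.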