| Multi-agent systems, an important component of artificial intelligence technology, have attracted close attention from the political, economic, military, and other sectors. Early research on artificial intelligence focused mainly on applications with a small number of agents, a single set of conditions, and a simple, static environment. Single-agent methods are not only inefficient but also struggle to handle the dynamic changes caused by a rapidly growing number of agents and an increasingly complex environment. Highly intelligent, autonomous UAV clusters, a likely focus of future attention, urgently need coordination, independent decision-making, and a control system that works in complex and harsh environments. Path planning and task allocation are important parts of UAV cluster control technology and face many difficulties, so this paper studies them. To overcome difficulties such as exponential explosion and a complex, variable environment, and to complete path planning and task allocation for UAV clusters quickly and effectively, this paper applies reinforcement learning and game theory to both problems under an incomplete information environment. The main contributions are as follows. (1) Path planning and task allocation for single and multiple UAVs have traditionally been modeled or analyzed in a known environment. In this paper, the UAVs and their environment are configured so that the UAV cluster can observe only the environmental factors within a certain range around itself, in an environment with randomly distributed threat areas. Based on the characteristics of UAV cluster path planning and task allocation under incomplete information, the corresponding payoff and constraint functions are established. (2) For the path planning problem of UAV clusters under incomplete information, an improved Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is proposed. First, an algorithm model is established for the path planning problem of UAV clusters under incomplete information, and the related parameters are analyzed and discussed. Second, to improve the path-finding efficiency of UAVs under incomplete information and to address their difficulty in handling unknown threat areas, a pseudo-collision constraint reward is introduced. Finally, the improved MADDPG algorithm is used to carry out path planning experiments for UAV clusters under incomplete information. To verify the effectiveness of the algorithm, comparison experiments are conducted over algorithm parameters, UAV cluster size, and against the original MADDPG algorithm. The simulation results show that the improved MADDPG algorithm can complete UAV path planning under incomplete information and has good robustness. (3) For the task allocation problem of UAV clusters under incomplete information, including cooperation and competition within the cluster, an improved Multi-Agent Proximal Policy Optimization (MAPPO) algorithm is proposed. First, an algorithm model is established for the task allocation characteristics of UAV clusters under incomplete information. Second, the MAPPO algorithm is improved on the basis of incomplete information game theory to handle cooperation and competition within UAV clusters. Finally, simulation experiments verify that the improved MAPPO algorithm can effectively solve the task allocation problem of UAV clusters under incomplete information. In addition, a comparison with the traditional heuristic algorithms particle swarm optimization (PSO) and genetic algorithm (GA) shows that the MAPPO algorithm is more robust and practical for the cooperation and competition problem within UAV clusters. |