Aiming at the problem of intelligent search and rescue after urban disasters, this paper divides the original search and rescue mission into two stages for problem abstraction and formal modeling. In the first stage, lightweight unmanned equipment is used to cover the rescue area and gather basic information about it. In the second stage, building on this preliminary information, multiple unmanned rescue platforms are dispatched for coordinated rescue. To address the large state space of traditional models and their poor real-time performance when solved online, this paper studies how multiple unmanned platforms can collaboratively complete urban search and rescue tasks based on neural combinatorial optimization methods and deep reinforcement learning technology. The main research content and innovations of this paper include:

(1) Proposed a regional coverage model for urban rescue. This paper abstracts and models the coverage search problem in the urban search and rescue process and builds a multi-unmanned-platform coverage search model. An optimization index better suited to the task requirements is designed: the longest route among all task vehicles is minimized, so that the worst-case path cost is reduced. The actual communication attributes of the roads are modeled, and a discriminant matrix is constructed to measure road connectivity. A solution strategy based on neural combinatorial optimization is proposed: a set of routing operators is designed, and iterative optimization is performed by applying small local modifications to an initial feasible solution, which reduces the difficulty of solving the problem and improves solution efficiency.

(2) Proposed a macro search and rescue strategy based on reinforcement learning. This paper distinguishes rescue actions into low-level atomic actions and macro actions. The low-level atomic actions are designed based on rules, while the macro-level strategic action of building selection is trained and learned with reinforcement learning algorithms. With buildings as the macro action space on top of the underlying atomic actions, the state-action space of reinforcement learning is reduced, the learning speed of rescue strategies is improved, and the performance of multi-agent collaboration is enhanced.

(3) Constructed a multi-unmanned-platform urban search and rescue simulation environment for rescue verification. This environment combines the RoboCup Rescue Simulation environment with the OpenAI Gym environment, so that mature reinforcement learning algorithms in the Python ecosystem can be applied to the RoboCup Rescue Simulation. The fused environment proposed in this paper provides a good experimental platform for multi-agent simulated rescue tasks and for multi-agent collaborative strategy algorithms.
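To make the first contribution concrete, the following is a minimal sketch of the min-max coverage routing idea: minimize the longest route among all task vehicles, respect a connectivity (discriminant) matrix over road segments, and improve an initial feasible solution with a small "relocate" routing operator. The distance model, the operator choice, and all function names are illustrative assumptions, not the paper's exact formulation.

```python
import math
import random

def route_length(route, dist, depot=0):
    """Length of one vehicle route: depot -> nodes -> depot."""
    path = [depot] + route + [depot]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))

def makespan(routes, dist):
    """Min-max objective: cost of the longest route among all vehicles."""
    return max(route_length(r, dist) for r in routes)

def feasible(route, connect, depot=0):
    """Check consecutive nodes against the discriminant matrix
    (1 = road segment usable, 0 = blocked)."""
    path = [depot] + route + [depot]
    return all(connect[a][b] for a, b in zip(path, path[1:]))

def relocate_operator(routes, dist, connect, iters=2000, seed=0):
    """Repeatedly move one node out of the longest route into another
    route (at its best position) whenever this lowers the objective."""
    rng = random.Random(seed)
    best = makespan(routes, dist)
    for _ in range(iters):
        worst = max(range(len(routes)), key=lambda i: route_length(routes[i], dist))
        if not routes[worst]:
            break
        node = rng.choice(routes[worst])
        src = [n for n in routes[worst] if n != node]
        improved = False
        for j, other in enumerate(routes):
            if j == worst:
                continue
            for pos in range(len(other) + 1):
                cand = other[:pos] + [node] + other[pos:]
                if not (feasible(cand, connect) and feasible(src, connect)):
                    continue
                trial = [src if i == worst else (cand if i == j else r)
                         for i, r in enumerate(routes)]
                cost = makespan(trial, dist)
                if cost < best:
                    routes, best, improved = trial, cost, True
                    break
            if improved:
                break
    return routes, best

# Toy instance: 6 coverage nodes, 2 vehicles, fully connected roads.
n = 7
coords = [(0, 0), (1, 2), (2, 1), (3, 3), (4, 0), (5, 2), (6, 1)]
dist = [[math.dist(a, b) for b in coords] for a in coords]
connect = [[1] * n for _ in range(n)]
init = [[1, 2, 3, 4, 5, 6], []]          # naive initial feasible solution
routes, cost = relocate_operator(init, dist, connect)
```

In the paper's framework, such operators would be selected and applied under a learned (neural combinatorial optimization) policy rather than at random; the sketch only shows the local-modification mechanics on an initial feasible solution.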
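For the second contribution, the following sketch illustrates the macro/atomic split: the reinforcement learning policy only picks a target building (the macro action), and a fixed rule-based controller expands that choice into low-level atomic actions. The classes, the movement rule, and the reward are illustrative placeholders, not the simulator's real API.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Building:
    bid: int
    position: tuple
    has_victims: bool = True

@dataclass
class Agent:
    position: tuple
    plan: List[str] = field(default_factory=list)

def atomic_actions_for(agent: Agent, target: Building) -> List[str]:
    """Rule-based expansion of a macro action into atomic actions:
    one grid move per step toward the building, then a rescue action."""
    steps = []
    x, y = agent.position
    tx, ty = target.position
    while (x, y) != (tx, ty):
        x += (tx > x) - (tx < x)
        y += (ty > y) - (ty < y)
        steps.append(f"MOVE({x},{y})")
    steps.append(f"RESCUE(building={target.bid})")
    return steps

def step_macro(agent: Agent, buildings: List[Building], macro_action: int):
    """One macro-level step: the RL action space has only |buildings|
    discrete choices instead of the full low-level state-action space."""
    target = buildings[macro_action]
    agent.plan = atomic_actions_for(agent, target)
    agent.position = target.position
    reward = 1.0 if target.has_victims else -0.1   # toy reward shaping
    target.has_victims = False
    return reward, agent.plan

agent = Agent(position=(0, 0))
buildings = [Building(0, (2, 3)), Building(1, (5, 1), has_victims=False)]
reward, plan = step_macro(agent, buildings, macro_action=0)
```

The point of the design is the size of the decision problem the learner faces: choosing among buildings is a far smaller discrete action space than choosing among all possible low-level movements, which is what speeds up strategy learning.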
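For the third contribution, the following sketch shows how a fusion environment could expose the RoboCup Rescue Simulation through the standard OpenAI Gym interface so that off-the-shelf Python RL algorithms can interact with it. `RescueSimClient` and its methods are placeholders for whatever bridge the paper actually uses between Python and the simulator.

```python
import gym
import numpy as np
from gym import spaces

class RescueGymEnv(gym.Env):
    """Gym wrapper: macro action = index of the building to handle next,
    observation = fixed-length feature vector of the rescue scenario."""

    def __init__(self, sim_client, num_buildings, obs_dim):
        super().__init__()
        self.sim = sim_client                         # hypothetical bridge to the simulator
        self.action_space = spaces.Discrete(num_buildings)
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(obs_dim,), dtype=np.float32)

    def reset(self):
        scenario = self.sim.reset_scenario()          # placeholder call
        return np.asarray(scenario, dtype=np.float32)

    def step(self, action):
        # Send the chosen building to the simulator, which runs the
        # rule-based atomic actions internally and reports the outcome.
        obs, reward, done = self.sim.execute_macro(int(action))
        return np.asarray(obs, dtype=np.float32), float(reward), bool(done), {}
```

Once the simulator is wrapped this way, any Gym-compatible reinforcement learning implementation can train rescue agents against it without knowing the details of the underlying RoboCup Rescue Simulation.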