In recent years, artificial intelligence has developed rapidly, and many fields now rely on it to solve problems that are otherwise hard to solve within those fields. The field of unmanned vehicles, and of multiple unmanned vehicles in particular, has also seen major breakthroughs. Recently, reinforcement learning was introduced into vehicle-group path planning, and communication-based multi-agent reinforcement learning methods were proposed to improve its performance: each agent can communicate with other agents, which greatly improves cooperation and competition among agents. However, existing methods assume that communication between agents is unrestricted, whereas in reality both communication rate and communication distance are limited. Based on this, I propose a multi-agent reinforcement learning method for path planning under communication constraints, and implement a more realistic and complete simulation environment and path planning process.
(1) Obtaining global information about unknown areas. I take the video or multiple images collected by drones and, through image stitching and semantic segmentation, obtain the road information of the target area, which provides global information for the subsequent multi-agent path planning.
(2) Building a vehicle reinforcement learning simulation platform with a communication simulation module, which makes the experimental scenario for the path planning method more realistic. The platform is divided into three modules: a vehicle driving simulation module based on CARLA, a vehicle communication simulation module based on OMNeT++, and a reinforcement learning module based on an Actor-Critic network.
(3) Proposing a path planning method based on communication-constrained multi-agent reinforcement learning, which takes the actual communication constraints between multiple vehicles into account. The method adopts the idea of federated learning, reducing communication overhead by transmitting network parameters instead of large amounts of complete data, and realizes cooperative-pursuit path planning through a suitably designed state space, action space, and reward function.
Experiments show that, given the road information, the path planning algorithm designed in this thesis achieves cooperative pursuit while reducing communication overhead, meeting the task requirements and improving the learning effect.
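
The parameter-sharing idea in the federated learning step can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the aggregation rule, so plain equal-weight averaging is assumed here, and the names `neighbours_in_range`, `average_params`, `federated_round`, and `max_range` are hypothetical. Each vehicle averages its network parameters only with the vehicles inside its communication range, so that parameters rather than complete data are exchanged.

```python
import math
from typing import Dict, List, Tuple

Params = List[float]            # flattened network parameters (illustrative)
Position = Tuple[float, float]  # vehicle position, arbitrary units (illustrative)


def neighbours_in_range(
    positions: Dict[str, Position], agent: str, max_range: float
) -> List[str]:
    """Agents (including `agent` itself) within `max_range` of `agent`."""
    ax, ay = positions[agent]
    return [
        other
        for other, (ox, oy) in positions.items()
        if math.hypot(ox - ax, oy - ay) <= max_range
    ]


def average_params(param_sets: List[Params]) -> Params:
    """Element-wise mean of equally sized parameter vectors."""
    n = len(param_sets)
    return [sum(values) / n for values in zip(*param_sets)]


def federated_round(
    params: Dict[str, Params],
    positions: Dict[str, Position],
    max_range: float,
) -> Dict[str, Params]:
    """One synchronous round: every agent averages with reachable neighbours.

    Assumption: equal-weight averaging; the real method may weight or
    schedule exchanges differently (for example, by communication rate).
    """
    return {
        agent: average_params(
            [params[o] for o in neighbours_in_range(positions, agent, max_range)]
        )
        for agent in params
    }


if __name__ == "__main__":
    positions = {"v1": (0.0, 0.0), "v2": (30.0, 0.0), "v3": (500.0, 0.0)}
    params = {"v1": [1.0, 2.0], "v2": [3.0, 4.0], "v3": [10.0, 10.0]}
    print(federated_round(params, positions, max_range=100.0))
```

In this toy setup, `v1` and `v2` are within range of each other and converge to the same averaged parameters, while `v3` is out of range and keeps its own, which is the kind of effect a distance constraint would have on parameter sharing.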