In recent years, intelligent warehouse systems have received increasing attention, and various types of such systems have emerged, such as Amazon’s Kiva system and sorting centers. These systems essentially involve path planning for hundreds or thousands of intelligent agents, ensuring that they reach their destinations quickly without colliding. With the development of deep reinforcement learning, it is now possible to solve more complex decision-making tasks, and applying deep reinforcement learning to multi-agent path planning is an emerging research direction in artificial intelligence. Currently, the most advanced multi-agent path planning algorithms still rely on centralized planning, which is not suitable for real-world deployment. A decentralized framework based on reinforcement learning can learn the optimal planning strategy while mitigating real-time performance problems. However, it may lead to more vertex conflicts, thus reducing the planning success rate or prolonging the planning time. To address these issues, this thesis studies how deep reinforcement learning can improve the success rate of multi-agent path planning in uncertain environments and reduce collisions between agents. The main research contents are as follows:

(1) A multi-agent path planning method based on an improved A3C (Asynchronous Advantage Actor-Critic) algorithm is proposed to address the problem of online replanning in noisy and uncertain environments. This method combines reinforcement learning and imitation learning to learn a fully decentralized policy, enabling agents to perform real-time reactive path planning in environments with only partially observable information, while demonstrating implicit coordination.

(2) A priority-based communication learning method is proposed to effectively avoid collisions in multi-agent path planning. This method combines an implicit priority learning module with traditional coupled planners, allowing multiple agents to dynamically determine the communication topology while working in coordination. Information transmission and decision-making are then performed based on the determined topology, effectively avoiding collisions.

This thesis investigates multi-agent path planning, focusing on replanning paths in noisy and uncertain environments and on effectively avoiding collisions. Two solutions based on deep reinforcement learning are proposed. Finally, the effectiveness of the methods is validated in a grid environment based on the Asprilo benchmark.