| As the use of robots becomes more widespread, mobile robots are increasingly being introduced into traditional manufacturing and service industries. The most fundamental technology for mobile robots is path planning. ROS (Robot Operating System) is the de facto standard platform for robot software, and its path planning is divided into two parts: global and local. However, the current official ROS path planner suffers from high resource consumption during navigation and slow response, and cannot keep up with the growing demand for robot applications. To address this, this thesis investigates a robot path planning method that fuses reinforcement learning with heuristic search, producing the DDPGWOA+A2C path planner. For global planning, the planner uses DDPGWOA, a new heuristic algorithm obtained by improving the Whale Optimization Algorithm (WOA) with the Deep Deterministic Policy Gradient (DDPG); for local planning, it uses Advantage Actor Critic (A2C). The main research elements are as follows. Firstly, a new heuristic algorithm, DDPGWOA, which improves WOA based on DDPG, is proposed. First, the traditional WOA strategy of following the optimal position is changed to following the optimal agent, which increases the algorithm’s exploration capability. Then DDPG is used to calculate the control parameters of WOA, which replaces the original algorithm’s fixed pattern of exploring first and exploiting afterwards and thus makes WOA more adaptive. Finally, DDPGWOA was tested on benchmark test functions, and the results showed that it had advantages in both convergence performance and stability. Secondly, we investigated global path planning for robots with heuristic search algorithms, including the establishment of a global path planning model, a path initialization method, and a fitness function for path solving. This thesis uses the DDPGWOA algorithm to solve the path planning model. We also conducted path planning comparison experiments between DDPGWOA and the moth-flame optimization algorithm, the grey wolf optimization algorithm, the particle swarm optimization algorithm, and the traditional whale optimization algorithm to verify the path planning capability of DDPGWOA. Thirdly, we produced a robot local path planner using the A2C algorithm. First, we built the training platform and designed the reward function, environment state, and execution action. Then the A2C agent was trained using an incremental training method. Finally, the A2C local path planner completed the challenge of navigating in an unknown environment in both simulated and real scenarios, which validated its navigation performance. Fourthly, we verified the navigation performance and practicality of the DDPGWOA+A2C path planner. The experimental platform was built in both simulated and real environments, and navigation comparison experiments were then conducted against the official path planner provided by ROS. The experimental results showed that DDPGWOA+A2C is superior in terms of navigation path length, navigation completion time, and planning response speed. Therefore, the DDPGWOA+A2C path planner can improve the efficiency of mobile robots. |
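
The role of the WOA control parameter mentioned in the abstract can be illustrated with a minimal sketch of the standard WOA update. This is not the thesis's implementation. In standard WOA the parameter `a` decays linearly from 2 to 0, which is what fixes the explore-first, exploit-later pattern. The `schedule` argument below is a hypothetical hook showing where a learned policy (such as a DDPG actor) could supply `a` instead. All function names, default values, and the pure-Python list representation are assumptions made for illustration.

```python
import math
import random


def linear_schedule(t, max_iter):
    """Standard WOA schedule: a decays linearly from 2 to 0."""
    return 2.0 * (1.0 - t / max_iter)


def woa_step(positions, best, a, rng, b=1.0):
    """Apply one standard WOA update to every agent for a given a."""
    new_positions = []
    for x in positions:
        r1, r2, p = rng.random(), rng.random(), rng.random()
        A = 2.0 * a * r1 - a
        C = 2.0 * r2
        if p < 0.5:
            # |A| < 1: encircle the best agent (exploitation);
            # otherwise move relative to a random agent (exploration).
            ref = best if abs(A) < 1.0 else rng.choice(positions)
            new_x = [ri - A * abs(C * ri - xi) for ri, xi in zip(ref, x)]
        else:
            # Spiral move around the best agent.
            ell = rng.uniform(-1.0, 1.0)
            spiral = math.exp(b * ell) * math.cos(2.0 * math.pi * ell)
            new_x = [abs(bi - xi) * spiral + bi for bi, xi in zip(best, x)]
        new_positions.append(new_x)
    return new_positions


def minimize(f, dim, n_agents=20, max_iter=200, bounds=(-10.0, 10.0),
             schedule=linear_schedule, seed=0):
    """Minimize f with WOA; `schedule` decides a at each iteration."""
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_agents)]
    best = list(min(positions, key=f))
    for t in range(max_iter):
        a = schedule(t, max_iter)  # hypothetical hook for a learned policy
        moved = woa_step(positions, best, a, rng)
        # Clamp every coordinate back into the search bounds.
        positions = [[min(max(v, lo), hi) for v in x] for x in moved]
        candidate = min(positions, key=f)
        if f(candidate) < f(best):
            best = list(candidate)
    return best, f(best)


def sphere(x):
    """Simple benchmark function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

Calling `minimize(sphere, 3)` runs the standard linear schedule, while `minimize(sphere, 3, schedule=lambda t, m: 0.5)` keeps `a` constant, so `|A| < 1` always holds and the `p < 0.5` branch never takes the exploratory path. A learned schedule would sit between these two extremes and choose `a` from the state of the search.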