In recent years, Reinforcement Learning (RL) methods have gradually been applied to the path planning of mobile robots in order to solve path planning in unknown environments. Compared with traditional path planning algorithms, RL does not require prior knowledge of the map environment. Building on these developments in artificial intelligence, this paper studies the autonomous path planning of robots in depth. Traditional reinforcement learning methods suffer from several problems in mobile robot path planning, such as slow training, unsmooth navigation paths, and poor generalization of the model to new environments. Based on the Deep Q Network (DQN) algorithm, this paper makes the following improvements:

(1) Starting from the reinforcement learning DQN algorithm, and combining the network structure characteristics of Double DQN and Dueling DQN, a low-complexity algorithm, Double Dueling DQN (DDDQN), is proposed. Instead of using a single network to both select and evaluate actions (i.e., to compute the Q value), action selection and action evaluation use different value functions, and the final Q value is decomposed into the sum of a state value function V and an advantage function A. This greatly reduces the overestimation of Q values and improves both the training performance and the training speed of the algorithm (a code sketch follows this abstract).

(2) A Prioritized Experience Replay (PER) mechanism is added to the training of the DDDQN algorithm. Compared with the conventional DDDQN algorithm, it improves the utilization of experience data, shortens training time, and improves navigation performance. Compared with a DDDQN variant that naively adds the PER mechanism, which may overfit in complex scenes, the proposed scheme also preserves the possibility of taking better actions in the early stage of training and improves the generalization ability of the algorithm (see the sketch below).

(3) The reinforcement learning environment setup is optimized. The reward function for autonomous robot navigation is improved and the reinforcement learning hyperparameters are tuned, which improves the navigation performance and success rate of the algorithm (a sketch follows below).

Experiments with the improved algorithm are carried out under the ROS framework, with different environment models built in Gazebo scenes. Horizontal comparisons verify the superiority of the DDDQN algorithm and of DDDQN with the PER mechanism. To verify generalization, the models trained by the various algorithms are transplanted to new environments for visual verification, and an experiment is also carried out in a real environment. Experimental results show that, compared with other path planning algorithms, the proposed DDDQN algorithm trains faster, navigates more effectively, and is more stable.
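As a concrete illustration of improvement (1), the following is a minimal PyTorch sketch of a dueling Q-network head (Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)) together with the Double DQN target computation; the layer sizes, names, and discount factor are illustrative assumptions, not the exact design used in the paper.

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Dueling architecture: Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a))."""

    def __init__(self, state_dim, num_actions, hidden=64):  # sizes are assumptions
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                 # state value V(s)
        self.advantage = nn.Linear(hidden, num_actions)   # advantage A(s, a)

    def forward(self, state):
        h = self.feature(state)
        v = self.value(h)
        a = self.advantage(h)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + (a - a.mean(dim=1, keepdim=True))

def double_dqn_target(online_net, target_net, reward, next_state, done, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it, which reduces Q-value overestimation."""
    with torch.no_grad():
        best_action = online_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, best_action).squeeze(1)
        return reward + gamma * next_q * (1.0 - done)
```

Decoupling selection (online network) from evaluation (target network), on top of the V/A decomposition, is what the abstract credits for reducing overestimation and speeding up training.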
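For improvement (2), here is a minimal sketch of a proportional prioritized replay buffer in the style of Schaul et al.; a production implementation would use a sum-tree for efficiency, and alpha, beta, and epsilon below are standard PER defaults rather than values reported in this paper.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Proportional PER: transition i is sampled with probability
    p_i^alpha / sum_k p_k^alpha, and importance-sampling weights
    w_i = (N * P(i))^(-beta) correct the resulting bias."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.data, self.priorities = [], []
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # guaranteed to be replayed at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        probs = np.asarray(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()  # normalize weights for stability
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Priority is the TD-error magnitude plus a small epsilon,
        # so no transition ever becomes unsampleable.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(float(err)) + eps
```

Replaying high-TD-error transitions more often is what shortens training, while the importance-sampling weights and the annealed beta are the standard levers for controlling the bias that the abstract associates with overfitting in complex scenes.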
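Improvement (3) mentions an improved reward function, but the abstract does not specify its terms, so the following is only a hedged sketch of a typical shaped navigation reward (goal bonus, collision penalty, progress shaping); all thresholds and weights are hypothetical.

```python
def navigation_reward(dist_to_goal, prev_dist_to_goal, min_obstacle_dist,
                      goal_radius=0.2, collision_dist=0.15):
    """Shaped reward for autonomous navigation: a large terminal bonus for
    reaching the goal, a large penalty for collision, and a dense term
    rewarding progress toward the goal. All constants are illustrative."""
    if dist_to_goal < goal_radius:
        return 100.0            # goal reached
    if min_obstacle_dist < collision_dist:
        return -100.0           # collision with an obstacle
    # Dense shaping: positive when the robot moved closer to the goal.
    return 10.0 * (prev_dist_to_goal - dist_to_goal)
```

The dense progress term gives the agent a learning signal on every step rather than only at episode termination, which is the usual reason such shaping improves navigation success rates.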