
Research On Path Planning Algorithm Based On Deep Reinforcement Learning

Posted on: 2021-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: X M Wu
Full Text: PDF
GTID: 2428330611496587
Subject: Control engineering

Abstract/Summary:
Current path planning methods have several shortcomings when facing unknown and complex environments: they cannot respond quickly to changes in the environment, cannot plan paths in real time, and converge slowly. In recent years, with the continuous development of deep learning and reinforcement learning, realizing mobile robot path planning with deep reinforcement learning has become a research focus in the field of artificial intelligence. When traditional path planning algorithms are applied to unknown and complex environments, it is difficult for them to find a collision-free path. Deep reinforcement learning, by contrast, lets the agent acquire experience, obstacle avoidance ability, and the ability to approach the target point while exploring the environment, so that the robot can obtain an optimal path through continuous trial and error. This thesis therefore studies path planning algorithms based on deep reinforcement learning. The main work is as follows:

(1) The experience replay buffer of the deep Q network (DQN) algorithm stores transitions in first-in, first-out order and samples them uniformly during training, which makes experience replay inefficient and slows both the robot's approach to the target and the path-finding process; in addition, the greedy exploration strategy leaves map information incomplete. To address this, a PER-NoisyNet DQN algorithm model is proposed. Samples are assigned priority weights when stored and are fed to the network for training in order of priority, while the replay buffer retains important data sequences and removes highly similar ones (an illustrative sketch of such a prioritized buffer is given after the abstract). The fully connected layers of the deep Q network are replaced with noisy layers to improve the exploration ability of the agent (see the noisy-layer sketch below). Experiments on the OpenAI Gym platform verify that the total reward value is about 10% higher than that of the original deep Q network, which shows that the mobile robot approaches the target more accurately.

(2) The action selection strategy of the deep Q network can fall into local optima, so the path trajectory of the mobile robot is not optimal. A PER-Dueling DQN algorithm model is therefore proposed. A dueling network mechanism is introduced into the network structure so that, when the agent selects an action, it judges whether the action can obtain a positive reward value and maximize the total return (a sketch of the dueling head is given below). Experimental results on the OpenAI Gym platform and on a two-dimensional grid map show that PER-Dueling DQN converges more efficiently than the original deep Q network algorithm, is more stable than the PER-NoisyNet DQN model, and increases the total reward value by about 11% to 13%. The subsequent path planning research is therefore based on the PER-Dueling DQN algorithm.

(3) Finally, three-dimensional environments without and with obstacles were built on the ROS and Gazebo platforms, and three-dimensional simulation experiments were performed on the TurtleBot3 mobile robot platform. The experimental results show that the total reward value stabilizes at around 4000 and that the maximum Q value rises gradually during exploration, which proves that the PER-Dueling DQN algorithm model is stable and that the agent learns both goal-seeking and obstacle avoidance abilities, effectively completing the path planning task. The trained model was then transplanted to the physical robot platform, and a real-world scene test was performed to realize the path planning task.
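
The priority-weighted storage and sampling described in contribution (1) correspond to prioritized experience replay. The following is a minimal sketch of a proportional prioritized buffer, assuming the standard formulation (priorities derived from TD error, importance-sampling weights to correct the sampling bias); the class name, capacity, and hyperparameter values are illustrative and not taken from the thesis.

    import numpy as np


    class PrioritizedReplayBuffer:
        """Minimal proportional prioritized experience replay (illustrative sketch)."""

        def __init__(self, capacity=10000, alpha=0.6):
            self.capacity = capacity
            self.alpha = alpha          # how strongly priorities skew sampling
            self.buffer = []            # stored transitions
            self.priorities = []        # one priority per transition
            self.pos = 0                # next write position (ring buffer)

        def push(self, state, action, reward, next_state, done):
            # new transitions get the current max priority so each is replayed at least once
            max_prio = max(self.priorities, default=1.0)
            if len(self.buffer) < self.capacity:
                self.buffer.append((state, action, reward, next_state, done))
                self.priorities.append(max_prio)
            else:
                self.buffer[self.pos] = (state, action, reward, next_state, done)
                self.priorities[self.pos] = max_prio
            self.pos = (self.pos + 1) % self.capacity

        def sample(self, batch_size, beta=0.4):
            # sampling probability proportional to priority**alpha
            prios = np.asarray(self.priorities) ** self.alpha
            probs = prios / prios.sum()
            idx = np.random.choice(len(self.buffer), batch_size, p=probs)
            # importance-sampling weights correct the bias of non-uniform sampling
            weights = (len(self.buffer) * probs[idx]) ** (-beta)
            weights /= weights.max()
            batch = [self.buffer[i] for i in idx]
            return batch, idx, weights

        def update_priorities(self, idx, td_errors, eps=1e-6):
            # refresh priorities from the latest TD errors after a training step
            for i, err in zip(idx, td_errors):
                self.priorities[i] = abs(float(err)) + eps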
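Replacing the fully connected layers with noisy layers, as in contribution (1), is the NoisyNet idea: exploration is driven by learned parameter noise rather than by epsilon-greedy action noise. Below is a sketch of a factorized-Gaussian noisy linear layer under that assumption; the initialization constants follow common practice and are not taken from the thesis.

    import math

    import torch
    import torch.nn as nn


    class NoisyLinear(nn.Module):
        """Factorized-Gaussian noisy linear layer (illustrative sketch)."""

        def __init__(self, in_features, out_features, sigma0=0.5):
            super().__init__()
            self.in_features = in_features
            self.out_features = out_features
            # learned means and noise scales for weight and bias
            self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
            self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
            self.bias_mu = nn.Parameter(torch.empty(out_features))
            self.bias_sigma = nn.Parameter(torch.empty(out_features))
            bound = 1.0 / math.sqrt(in_features)
            nn.init.uniform_(self.weight_mu, -bound, bound)
            nn.init.uniform_(self.bias_mu, -bound, bound)
            nn.init.constant_(self.weight_sigma, sigma0 * bound)
            nn.init.constant_(self.bias_sigma, sigma0 * bound)

        @staticmethod
        def _f(size):
            # noise-shaping function f(x) = sign(x) * sqrt(|x|)
            x = torch.randn(size)
            return x.sign() * x.abs().sqrt()

        def forward(self, x):
            # fresh factorized noise on every forward pass
            eps_in, eps_out = self._f(self.in_features), self._f(self.out_features)
            weight = self.weight_mu + self.weight_sigma * torch.outer(eps_out, eps_in)
            bias = self.bias_mu + self.bias_sigma * eps_out
            return nn.functional.linear(x, weight, bias)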
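The dueling mechanism in contribution (2) decomposes the Q-value into a state value V(s) and per-action advantages A(s,a). A minimal PyTorch sketch of such a head is given below, assuming the standard mean-subtracted aggregation; layer sizes are illustrative.

    import torch.nn as nn


    class DuelingQNetwork(nn.Module):
        """Illustrative dueling head: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""

        def __init__(self, obs_dim, n_actions, hidden=128):
            super().__init__()
            self.feature = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
            self.value = nn.Linear(hidden, 1)              # state value V(s)
            self.advantage = nn.Linear(hidden, n_actions)  # advantages A(s, a)

        def forward(self, obs):
            h = self.feature(obs)
            v = self.value(h)
            a = self.advantage(h)
            # subtracting the mean advantage keeps V and A identifiable
            return v + a - a.mean(dim=1, keepdim=True)

In a PER-Dueling DQN agent, a head like this would simply replace the final layer of the Q network; training otherwise proceeds as for an ordinary DQN with a prioritized buffer.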
Keywords/Search Tags:Deep Reinforcement Learning, Path Planning, Robot, ROS operating system, Gazebo, TurtleBot3