
Mobile Robot Path Planning Based On DDPG Reinforcement Learning Network

Posted on: 2020-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Liu
Full Text: PDF
GTID: 2428330590459680
Subject: Engineering
Abstract/Summary:
Path planning of mobile robots in unknown environments is a core problem in robot navigation and the focus of much research. Reinforcement learning, which learns from a reward-and-punishment mechanism rather than from labeled data, can select optimal actions by interacting with the environment without any prior knowledge, making it of strong practical value for robot path planning in unknown environments. Path planning requires a continuous action space; the Deep Deterministic Policy Gradient (DDPG) algorithm, built on the Actor-Critic framework, handles continuous action spaces effectively by having the policy output actions directly. This thesis applies the DDPG reinforcement learning algorithm to mobile robot path planning in unknown environments.

The thesis first surveys the state of research on path planning and reinforcement learning at home and abroad, and reviews the classification and development of reinforcement learning algorithms. After analyzing and comparing several existing algorithms, it selects DDPG combined with a neural network for mobile robot path planning. Background knowledge related to deep neural networks and the DDPG algorithm, such as experience replay, is described in detail, laying the theoretical foundation for the subsequent improvement of DDPG.

To build the DDPG path-planning framework, a neural network model is designed according to the robot's environment state and action space, and a reward function is designed for the path planning task. As the learning ability of the DDPG algorithm improves, a fixed experience-pool capacity and fixed sampling batch size cannot meet the needs of multi-feature data, which limits the algorithm's learning speed. To solve this problem, the thesis proposes the DDPG-vcep (variable-capacity experience pool) algorithm. By adding a learning curve to DDPG, the algorithm adjusts the capacity of the experience pool and the sampling batch size in real time according to its own learning curve, improving the utilization of sample data. Simulation results show that the cumulative reward of DDPG-vcep is about 30% higher than that of the existing algorithm across different numbers of training episodes. The simulation environment is built with Python and the Pyglet library; it reproduces the real environment as faithfully as possible, making algorithm training more efficient.

Finally, the DDPG-vcep algorithm is validated in three-dimensional simulation and real-world experiments on Ubuntu with the ROS robot operating system. In Gazebo, a simplified Roch robot model and an obstacle environment are built to realize path planning in a three-dimensional environment. In a real laboratory environment, a physical path-planning scene is set up and experiments are carried out with the Roch robot. The results show that the Roch robot running DDPG-vcep can find a collision-free path from the starting point to the target point in an unknown environment.
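The variable-capacity experience pool at the heart of DDPG-vcep can be sketched as follows. The abstract does not give implementation details, so the class name, the use of a moving average of episode returns as the "learning curve", and the doubling rule for capacity and batch size are all illustrative assumptions, not the thesis's actual design:

```python
import random
from collections import deque


class VariableCapacityReplayBuffer:
    """Sketch of a variable-capacity experience pool.

    Tracks a moving average of episode returns (a simple "learning
    curve") and grows both the pool capacity and the sampling batch
    size once the agent starts to improve, so that richer multi-feature
    data can be retained later in training.
    """

    def __init__(self, min_capacity=10_000, max_capacity=100_000,
                 min_batch=32, max_batch=256):
        self.capacity = min_capacity
        self.max_capacity = max_capacity
        self.batch_size = min_batch
        self.max_batch = max_batch
        self.buffer = deque(maxlen=self.capacity)
        self.reward_history = deque(maxlen=50)  # recent episode returns

    def store(self, state, action, reward, next_state, done):
        """Append one transition; oldest entries drop off automatically."""
        self.buffer.append((state, action, reward, next_state, done))

    def end_episode(self, episode_return):
        """Update the learning curve; resize the pool if improving."""
        self.reward_history.append(episode_return)
        if len(self.reward_history) >= 10:
            recent = list(self.reward_history)
            half = len(recent) // 2
            # Crude learning-curve slope: second-half mean vs first-half mean.
            improving = (sum(recent[half:]) / (len(recent) - half)
                         > sum(recent[:half]) / half)
            if improving and self.capacity < self.max_capacity:
                self.capacity = min(self.capacity * 2, self.max_capacity)
                self.batch_size = min(self.batch_size * 2, self.max_batch)
                # deque maxlen is immutable, so rebuild with new capacity.
                self.buffer = deque(self.buffer, maxlen=self.capacity)

    def sample(self):
        """Draw a uniform random minibatch at the current batch size."""
        k = min(self.batch_size, len(self.buffer))
        return random.sample(self.buffer, k)
```

In a DDPG training loop, `store` would be called once per environment step and `end_episode` once per episode; the critic and actor updates then draw their minibatches from `sample()` at whatever batch size the pool has grown to.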
Keywords/Search Tags: Mobile robot, Reinforcement learning, Path planning, DDPG-vcep