
Deep Reinforcement Learning Based Mobile Robot Path Planning

Posted on: 2022-12-22    Degree: Master    Type: Thesis
Country: China    Candidate: Y P Zhao    Full Text: PDF
GTID: 2518306746986329    Subject: Computer Science and Technology
Abstract/Summary:
With the development of robotics, the application areas of mobile robots are expanding, and path planning in unknown, complex environments has become a hot research topic. Most traditional path planning algorithms rely on an explicit representation of the environment and cannot respond quickly to environmental changes. Research on path planning methods that enable mobile robots to plan their own paths through autonomous learning and decision making in unknown, complex environments, thereby improving their environmental adaptability, has therefore become an urgent problem. In 2016, one of the main techniques behind AlphaGo's victory over human Go players was Deep Reinforcement Learning (DRL). DRL is an important technology for solving complex tasks because it combines the ability of Deep Learning (DL) to extract high-level features from raw information with the autonomous learning and decision-making capabilities of Reinforcement Learning (RL). Spiking Neural Networks (SNNs) are biologically plausible neural networks. They use spiking neurons as computational units, which incorporate temporal and spatial information to simulate the encoding and processing of information in the human brain and transmit information through the precise timing of spike sequences, providing sparse but powerful computation. SNNs are also easy to implement in hardware, and the neuromorphic hardware that has emerged in recent years has the potential to meet the rapidly growing needs of artificial intelligence at lower energy consumption.

This thesis combines the advantages of DRL and SNNs to study DRL-based path planning methods for mobile robots. To address the unsmooth paths and slow convergence of existing DRL path planning algorithms, the TPR-DDPG, Spike DDPG, and Spike DQN algorithms are proposed for the path planning task, with the Pioneer 3-DX mobile robot as the research object. The details are as follows.

Firstly, the research background and significance of the path planning task for mobile robots in unknown, complex environments are introduced, together with the current status of research on DRL and SNNs and on DRL-based path planning methods for mobile robots.

Secondly, the basic concepts of RL and DRL underlying the mainstream algorithms Q-Learning, SARSA, Deep Q Network (DQN), and Deep Deterministic Policy Gradient (DDPG) are introduced, along with the basics of SNN encoding, neuron models, and Spatio-Temporal Back Propagation (STBP).

Thirdly, a path planning method for mobile robots based on TPR-DDPG is proposed, focusing on the algorithm flow, the actor and critic network structures, the three-part reward function, and state preprocessing. The effectiveness of TPR-DDPG is evaluated in environments of different complexity, with different initial azimuth angles and different initial poses and target points, and compared with Q-Learning. The experimental results show that, unlike Q-Learning, the TPR-DDPG algorithm finds a smooth, collision-free optimal path in unknown, complex environments.

Fourthly, a directly trainable PIPLIF spiking neuron model is proposed and combined with the TPR-DDPG algorithm to obtain a Spike DDPG-based path planning algorithm for mobile robots. The algorithm flow, the spiking actor network structure and its learning process, and the state spike encoding are highlighted. Experiments with different state spike encodings and different starting poses and target points are conducted in environments of different complexity, and the performance of Spike DDPG and TPR-DDPG is compared for different simulation time window lengths. The results show that Spike DDPG (T = 10 timesteps) with direct encoding of states finds a collision-free optimal path in unknown, complex environments and outperforms TPR-DDPG in both convergence speed and robot walking speed.

Fifthly, a Q-value spike decoding method
and a Spike DQN-based path planning algorithm for mobile robots are proposed. A detailed derivation of the Q-value spike decoding process is presented, focusing on the algorithm flow, the spiking value network, state spike encoding, and the discrete action design. Comparison experiments in environments of different complexity evaluate DQN against Spike DQN under different state spike encodings; the results show that Spike DQN (T = 10 timesteps) with direct encoding of states successfully plans collision-free paths in unknown, complex environments and converges faster than DQN. Finally, the thesis work is summarized and future research directions are discussed.
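To make the "three-part reward function" idea concrete: a common shaping for mobile-robot path planning combines a goal-reaching bonus, a collision penalty, and a dense progress term. The exact terms and weights used in TPR-DDPG are specific to the thesis, so every constant and threshold below is a hypothetical stand-in; this is only a minimal sketch of the pattern:

```python
def three_part_reward(dist_to_goal, prev_dist_to_goal, min_obstacle_dist,
                      goal_radius=0.3, collision_radius=0.2):
    """Illustrative three-part reward: terminal goal bonus, terminal
    collision penalty, and a dense progress-toward-goal term.
    All thresholds and weights are assumed values, not the thesis's."""
    if dist_to_goal < goal_radius:            # reached the target
        return 100.0
    if min_obstacle_dist < collision_radius:  # hit an obstacle
        return -100.0
    # dense shaping: positive when the robot moved closer to the goal
    return 10.0 * (prev_dist_to_goal - dist_to_goal)
```

The dense third term is what keeps learning from stalling between the sparse terminal rewards; without it, DDPG-style agents rarely see a nonzero reward early in training.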
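The directly trainable PIPLIF neuron used in Spike DDPG is specific to the thesis; as a generic illustration of how "direct encoding of states" over a simulation time window works, the sketch below uses a plain leaky integrate-and-fire (LIF) neuron. The analog state vector is injected as a constant input current at every one of the T timesteps, and the network's activity is the resulting spike train of length T = 10. The parameters tau and v_th are assumed values:

```python
import numpy as np

def lif_forward(x, T=10, tau=0.5, v_th=1.0):
    """Simulate a layer of leaky integrate-and-fire neurons for T steps.
    Direct encoding: the analog input x is injected as a constant
    current each timestep. Returns the spike train, shape (T, n)."""
    v = np.zeros_like(x, dtype=float)    # membrane potentials
    spikes = []
    for _ in range(T):
        v = tau * v + x                  # leaky integration of input current
        s = (v >= v_th).astype(float)    # fire when threshold is crossed
        v = v * (1.0 - s)                # hard reset after a spike
        spikes.append(s)
    return np.stack(spikes)
```

Stronger inputs fire more often within the window, which is why a short window such as T = 10 can already carry a usable analog-to-spike code; training such a network end to end additionally requires a surrogate gradient for the non-differentiable threshold (the role STBP plays in the thesis).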
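The thesis derives its own Q-value spike decoding; one simple generic realization of the same idea is rate decoding, where each discrete action's Q-estimate is read off the firing rate of its output neuron over the T-step window. This sketch is an assumption-laden stand-in, not the thesis's derivation:

```python
import numpy as np

def decode_q_from_spikes(spike_train):
    """Rate decoding: the Q-value estimate for each discrete action is
    the firing rate of its output neuron over the window.
    spike_train: array of shape (T, num_actions) with 0/1 entries."""
    return spike_train.mean(axis=0)

def greedy_action(spike_train):
    # pick the action whose output neuron fired most often
    return int(np.argmax(decode_q_from_spikes(spike_train)))
```

In a Spike DQN-style loop, the decoded vector plays exactly the role of the DQN head's Q-values: it feeds both the epsilon-greedy action choice and the temporal-difference target.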
Keywords/Search Tags:Mobile Robot, Deep Reinforcement Learning, Spiking Neural Network, Integrate and Fire Neuron Model, Deep Deterministic Policy Gradient, Deep Q Network