
Research On Motion Planning In Dynamic Environment Based On Deep Reinforcement Learning

Posted on: 2022-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: L T Liu
Full Text: PDF
GTID: 2518306569495464
Subject: Control Science and Engineering

Abstract/Summary:
Navigation is one of the core functions of a mobile robot: the robot must plan collision-free motion through a map according to some optimality principle. Motion planning on static maps is already mature; achieving efficient, collision-free paths in dynamic environments remains one of the frontier problems in mobile robotics. Reinforcement learning is a class of machine learning methods that learns through interaction with the environment, and in many tasks it can match or even surpass human performance through continuous learning, making it one of the main research directions for navigation in dynamic environments. This thesis focuses on end-to-end dynamic navigation, in which raw sensor data are used directly as the input for learning and training.

Firstly, a simulation platform for mobile-robot motion planning in dynamic environments is built on Gazebo, and the state space, action space and reward function are defined according to the requirements of the dynamic navigation task. Seven common dynamic interaction scenarios are defined for algorithm verification and testing. Because the temporal correlation of sensor information strongly affects navigation in dynamic environments, dynamic environment perception is formulated as a sequence prediction problem, and a temporal convolution model with an attention mechanism is proposed to extract the key temporal features of the environment.

Secondly, a maximum entropy reinforcement learning algorithm with value distribution and optimistic exploration is proposed to alleviate the inherent defects of maximum entropy reinforcement learning; it effectively improves learning efficiency and the quality of policy exploration.

Thirdly, to meet the demand for high sample efficiency in robot learning, an adaptive prioritized experience replay of near-policy samples is proposed, which introduces priorities while alleviating off-policyness and significantly improves the convergence speed of robot learning.

Finally, comprehensive experiments in the simulation environment demonstrate the strong performance of deep reinforcement learning applied to robot motion planning in dynamic environments. The related algorithm implementations and data have been open-sourced (https://github.com/Taospirit/Dynamic?Navigation) for other researchers to reproduce and build on.
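
To illustrate the temporal-convolution-with-attention perception model described above, the following is a minimal PyTorch sketch of the general idea only: a dilated causal 1-D convolution stack over a short history of range scans, followed by self-attention over the time steps. The class names, layer sizes and scan dimension (e.g. TCNAttentionEncoder, scan_dim=360) are illustrative assumptions, not the network used in the thesis; the actual implementation is in the linked repository.

```python
# Sketch only: dilated causal temporal convolutions + self-attention over time.
# All sizes and names are assumptions for illustration, not the thesis's network.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1-D convolution whose trailing padding is chomped so outputs stay causal."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):                      # x: (batch, channels, time)
        return self.conv(x)[..., :-self.pad]   # drop the extra right-side steps

class TCNAttentionEncoder(nn.Module):
    def __init__(self, scan_dim=360, hidden=64):
        super().__init__()
        # Dilated convolutions extract features along the time axis.
        self.tcn = nn.Sequential(
            CausalConv1d(scan_dim, hidden, kernel_size=2, dilation=1), nn.ReLU(),
            CausalConv1d(hidden, hidden, kernel_size=2, dilation=2), nn.ReLU(),
        )
        # Self-attention weighs which past time steps matter most.
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)

    def forward(self, scans):                  # scans: (batch, seq_len, scan_dim)
        h = self.tcn(scans.transpose(1, 2))    # convolve over the time axis
        h = h.transpose(1, 2)                  # -> (batch, seq_len, hidden)
        out, _ = self.attn(h, h, h)            # attention across time steps
        return out[:, -1]                      # feature vector of the latest step
```

For example, a four-step history of 360-beam laser scans would be passed as a tensor of shape (batch, 4, 360), yielding one feature vector per sample for the downstream policy and value networks.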
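The combination of a value distribution with optimistic exploration in a maximum entropy framework can be sketched as follows. This is a hedged illustration of the general mechanism (a quantile critic, a soft backup, and an upper-confidence action value), not the thesis's algorithm; the function names and the coefficient kappa are assumptions.

```python
# Sketch: distributional (quantile) critic combined with a maximum-entropy
# backup and an optimism bonus for exploration. Names and "kappa" are
# illustrative assumptions, not the algorithm proposed in the thesis.
import torch

def soft_distributional_target(q_quantiles_next, logp_next, alpha, reward, gamma, done):
    # q_quantiles_next: (batch, n_quantiles) quantile estimates of Q(s', a').
    # Maximum-entropy backup: subtract alpha * log pi(a'|s') from each quantile.
    soft_next = q_quantiles_next - alpha * logp_next.unsqueeze(-1)
    return reward.unsqueeze(-1) + gamma * (1.0 - done).unsqueeze(-1) * soft_next

def optimistic_q(q_quantiles, kappa=1.0):
    # Optimistic exploration: prefer actions by mean + kappa * spread of the
    # predicted return distribution rather than by the mean alone.
    return q_quantiles.mean(dim=-1) + kappa * q_quantiles.std(dim=-1)
```

The intuition is that the quantiles capture uncertainty in the return, and acting on an upper bound of that distribution encourages the policy to explore promising but uncertain actions while the entropy term keeps the policy stochastic.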
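The idea of prioritizing near-policy samples can likewise be sketched with a small replay buffer whose priorities mix the TD error with how likely the stored action is under the current policy. The mixing rule and the weight lambda_onpolicy below are assumptions for illustration, not the adaptive scheme developed in the thesis.

```python
# Sketch: prioritized replay that favours transitions which are both surprising
# (large TD error) and close to the current policy (high density under it).
# The priority rule and "lambda_onpolicy" are illustrative assumptions.
import numpy as np

class NearPolicyPrioritizedReplay:
    def __init__(self, capacity, alpha=0.6, beta=0.4, lambda_onpolicy=0.5):
        self.capacity, self.alpha, self.beta = capacity, alpha, beta
        self.lambda_onpolicy = lambda_onpolicy
        self.data, self.priorities, self.pos = [], np.zeros(capacity), 0

    def add(self, transition):
        max_p = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p          # new samples start at max priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        p = self.priorities[:len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors, logprob_under_current_policy):
        # Density of the stored action under the current policy: higher means
        # the transition is closer to on-policy and is replayed more often.
        closeness = np.exp(logprob_under_current_policy)
        self.priorities[idx] = (np.abs(td_errors) + 1e-6) * (
            1.0 - self.lambda_onpolicy + self.lambda_onpolicy * closeness)
```

Biasing replay toward near-policy transitions reduces the distribution mismatch that ordinary prioritized replay can introduce, which is the off-policyness the thesis aims to alleviate.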
Keywords/Search Tags:navigation in dynamic environment, temporal convolutional network, maximum entropy reinforcement learning, distributional reinforcement learning, priority experience replay, off-policy correction