
Research On Motion Planning In Dynamic Environment Based On Deep Reinforcement Learning

Posted on: 2022-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: L T Liu
Full Text: PDF
GTID: 2518306569495464
Subject: Control Science and Engineering

Abstract/Summary:
Navigation is one of the core functions of a mobile robot: the robot must plan collision-free motion through a map according to some optimality principle. Motion planning on static maps is already mature; achieving efficient, collision-free paths in dynamic environments remains one of the frontier problems in mobile robotics. Reinforcement learning is a class of machine learning methods that learns through interaction with the environment, and in many tasks it can match or even surpass human performance through continuous learning, making it one of the main research directions for navigation in dynamic environments. This thesis focuses on end-to-end dynamic navigation, in which raw sensor data are used directly as the input for learning and training.

Firstly, a simulation platform for mobile-robot motion planning in dynamic environments is built on Gazebo, and the state space, action space and reward function are defined according to the requirements of the dynamic navigation task. Seven common dynamic interaction scenarios are defined for algorithm verification and testing. Because the temporal correlation of sensor information strongly affects navigation in dynamic environments, dynamic environment perception is formulated as a sequence prediction problem, and a temporal convolution model with an attention mechanism is proposed to extract the key temporal features of the environment.

Secondly, a maximum entropy reinforcement learning algorithm with value distribution and optimistic exploration is proposed to alleviate the inherent defects of maximum entropy reinforcement learning; it effectively improves learning efficiency and the quality of policy exploration.

Thirdly, to meet the demand for high sample efficiency in robot learning, an adaptive prioritized experience replay of near-policy samples is proposed, which introduces priorities while alleviating off-policyness and significantly improves the convergence speed of robot learning.

Finally, comprehensive experiments in the simulation environment demonstrate the strong performance of deep reinforcement learning applied to robot motion planning in dynamic environments. The related algorithm implementations and data have been open-sourced (https://github.com/Taospirit/Dynamic?Navigation) for other researchers to reproduce and build on.
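
To illustrate the temporal-convolution-with-attention perception model described above, the following is a minimal PyTorch sketch of the general idea only: a dilated causal 1-D convolution stack over a short history of range scans, followed by self-attention over the time steps. The class names, layer sizes and scan dimension (e.g. TCNAttentionEncoder, scan_dim=360) are illustrative assumptions, not the network used in the thesis; the actual implementation is in the linked repository.

```python
# Sketch only: dilated causal temporal convolutions + self-attention over time.
# All sizes and names are assumptions for illustration, not the thesis's network.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1-D convolution whose trailing padding is chomped so outputs stay causal."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):                      # x: (batch, channels, time)
        return self.conv(x)[..., :-self.pad]   # drop the extra right-side steps

class TCNAttentionEncoder(nn.Module):
    def __init__(self, scan_dim=360, hidden=64):
        super().__init__()
        # Dilated convolutions extract features along the time axis.
        self.tcn = nn.Sequential(
            CausalConv1d(scan_dim, hidden, kernel_size=2, dilation=1), nn.ReLU(),
            CausalConv1d(hidden, hidden, kernel_size=2, dilation=2), nn.ReLU(),
        )
        # Self-attention weighs which past time steps matter most.
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)

    def forward(self, scans):                  # scans: (batch, seq_len, scan_dim)
        h = self.tcn(scans.transpose(1, 2))    # convolve over the time axis
        h = h.transpose(1, 2)                  # -> (batch, seq_len, hidden)
        out, _ = self.attn(h, h, h)            # attention across time steps
        return out[:, -1]                      # feature vector of the latest step
```

For example, a four-step history of 360-beam laser scans would be passed as a tensor of shape (batch, 4, 360), yielding one feature vector per sample for the downstream policy and value networks.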
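The combination of a value distribution with optimistic exploration in a maximum entropy framework can be sketched as follows. This is a hedged illustration of the general mechanism (a quantile critic, a soft backup, and an upper-confidence action value), not the thesis's algorithm; the function names and the coefficient kappa are assumptions.

```python
# Sketch: distributional (quantile) critic combined with a maximum-entropy
# backup and an optimism bonus for exploration. Names and "kappa" are
# illustrative assumptions, not the algorithm proposed in the thesis.
import torch

def soft_distributional_target(q_quantiles_next, logp_next, alpha, reward, gamma, done):
    # q_quantiles_next: (batch, n_quantiles) quantile estimates of Q(s', a').
    # Maximum-entropy backup: subtract alpha * log pi(a'|s') from each quantile.
    soft_next = q_quantiles_next - alpha * logp_next.unsqueeze(-1)
    return reward.unsqueeze(-1) + gamma * (1.0 - done).unsqueeze(-1) * soft_next

def optimistic_q(q_quantiles, kappa=1.0):
    # Optimistic exploration: prefer actions by mean + kappa * spread of the
    # predicted return distribution rather than by the mean alone.
    return q_quantiles.mean(dim=-1) + kappa * q_quantiles.std(dim=-1)
```

The intuition is that the quantiles capture uncertainty in the return, and acting on an upper bound of that distribution encourages the policy to explore promising but uncertain actions while the entropy term keeps the policy stochastic.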
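The idea of prioritizing near-policy samples can likewise be sketched with a small replay buffer whose priorities mix the TD error with how likely the stored action is under the current policy. The mixing rule and the weight lambda_onpolicy below are assumptions for illustration, not the adaptive scheme developed in the thesis.

```python
# Sketch: prioritized replay that favours transitions which are both surprising
# (large TD error) and close to the current policy (high density under it).
# The priority rule and "lambda_onpolicy" are illustrative assumptions.
import numpy as np

class NearPolicyPrioritizedReplay:
    def __init__(self, capacity, alpha=0.6, beta=0.4, lambda_onpolicy=0.5):
        self.capacity, self.alpha, self.beta = capacity, alpha, beta
        self.lambda_onpolicy = lambda_onpolicy
        self.data, self.priorities, self.pos = [], np.zeros(capacity), 0

    def add(self, transition):
        max_p = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p          # new samples start at max priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        p = self.priorities[:len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors, logprob_under_current_policy):
        # Density of the stored action under the current policy: higher means
        # the transition is closer to on-policy and is replayed more often.
        closeness = np.exp(logprob_under_current_policy)
        self.priorities[idx] = (np.abs(td_errors) + 1e-6) * (
            1.0 - self.lambda_onpolicy + self.lambda_onpolicy * closeness)
```

Biasing replay toward near-policy transitions reduces the distribution mismatch that ordinary prioritized replay can introduce, which is the off-policyness the thesis aims to alleviate.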
Keywords/Search Tags:navigation in dynamic environment, temporal convolutional network, maximum entropy reinforcement learning, distributional reinforcement learning, priority experience replay, off-policy correction