
Research On Mapless Navigation Based On Reinforcement Learning

Posted on: 2020-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: L L Ma
Full Text: PDF
GTID: 2428330590974642
Subject: Mechanical engineering
Abstract/Summary:
Navigation is the ability of a robot to reach a target pose from its current position without colliding with obstacles along the way; it is one of the core functions of a mobile robot. Existing mature techniques all plan on known environment maps. In contrast, a biological system can reach a target through loose or dense obstacles given only the approximate direction or characteristics of the target; that is, it performs mapless navigation. Reinforcement learning, in which an agent learns from interactions with its environment, is well suited to sequential decision-making tasks and has become the main direction of mapless navigation.

This thesis studies two different levels of mapless mobile-robot navigation based on reinforcement learning. The planner takes an RGB image as visual input and the position relative to the goal as target information. I designed two end-to-end navigation policies; proposed a new planner that encodes the image into a low-dimensional latent vector and feeds it into the decision network, which greatly improves sampling efficiency; and proposed a stacked Long Short-Term Memory (LSTM) structure that gives the reinforcement-learning network reasoning ability. To test and compare different network structures and algorithms, a series of benchmark environments was built, and an environment interface is provided for easy reuse.

First, in the memory task, I proposed a PPO-based end-to-end navigation policy and built a benchmark environment to compare it with the classic DQN-based end-to-end navigation policy.

Second, in the usual end-to-end reinforcement-learning network, many parameters are devoted to extracting image features. These parameters are updated together with the decision-layer parameters during interaction with the environment, which reduces the sampling efficiency of the algorithm. In fact, image feature extraction does not require environmental information; that is, it can be learned without interaction. I proposed an image-compression method based on a variational auto-encoder (VAE): it compresses the RGB image into low-dimensional features, which are then fed, together with other information, into the decision layers. This improves the sampling efficiency of reinforcement learning by more than a factor of two, with better final performance.

Third, a navigation task should include not only robust navigation to a specific point but also the reasoning ability to reach any specified target point in the free space of the environment. I proposed a stacked LSTM structure with a memory function. The module takes visual and target information as input for decision making, and also takes the action and reward of the previous step to help the network better understand the task. The use of a recurrent neural network gives the network the ability to memorize, and the stacked structure realizes the reasoning ability. The planner based on the stacked LSTM achieved a success rate of more than 60 percent in the test environment, which is among the state-of-the-art results for dense environments with RGB visual input.

Finally, a real-world experiment was carried out to validate the reasoning planner. All the algorithms and benchmark environments proposed in this thesis are open source for further research: https://github.com/marooncn/navbot.
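The per-step input to the reasoning planner described above combines the VAE latent code of the current frame, the relative goal information, and the previous action and reward. The following is a minimal sketch of that feature-packing step; the dimensions, function name, and one-hot action encoding are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def pack_policy_input(latent, goal, prev_action, prev_reward, n_actions):
    """Concatenate the quantities fed to the recurrent decision network:
    the low-dimensional visual latent code, the relative-goal vector,
    a one-hot encoding of the previous action, and the previous reward."""
    action_onehot = np.zeros(n_actions)
    action_onehot[prev_action] = 1.0
    return np.concatenate([latent, goal, action_onehot, [prev_reward]])

# Illustrative sizes: 32-d latent, 2-d goal (distance, heading), 5 actions.
x = pack_policy_input(np.zeros(32), np.array([1.5, 0.3]),
                      prev_action=2, prev_reward=-0.01, n_actions=5)
# x has dimension 32 + 2 + 5 + 1 = 40
```

In this scheme the image encoder can be trained offline on collected frames, so only the small decision network learns from environment interaction, which is the source of the sampling-efficiency gain claimed above.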
Keywords/Search Tags:mapless navigation, reinforcement learning, Proximal Policy Optimization, RGB image, sampling efficiency, reasoning, stacked long short-term memory