
Agent Navigation Based On Deep Reinforcement Learning

Posted on: 2022-10-13    Degree: Doctor    Type: Dissertation
Country: China    Candidate: F Y Zeng    Full Text: PDF
GTID: 1488306524470544    Subject: Circuits and Systems
Abstract/Summary:
Navigation is a key technology that allows an agent to adapt to and interact with its surrounding environment, and it is a prerequisite for more advanced agent behaviors. Navigation based on deep reinforcement learning is an end-to-end approach in which navigation policies are inferred directly from visual images of the environment. It can also serve as the basis for AI-based navigation and be readily integrated with other AI tasks to accomplish more intelligent behaviors. Consequently, a growing number of researchers are working on visual navigation based on deep reinforcement learning. Although deep reinforcement learning has achieved important results in many fields, several key issues remain when it is applied to navigation in complex environments, especially when states are partially observable or rewards are sparse. This dissertation addresses four such issues: how to resolve the training instability of navigation based on hierarchical deep reinforcement learning and improve the agent's learning efficiency by making internal rewards dense; how to endow an agent with memory so as to improve its navigation performance in partially observable environments; how, inspired by the way humans learn, to impart prior navigation knowledge to an agent to improve its learning speed and transferability; and, given that no navigation training environment suitable for building-decoration robots exists, how to design such an environment so that agents can successfully learn to navigate in decoration scenes. To address these four issues, we carry out the related research; the contributions of this dissertation are as follows.

(1) To resolve the training instability of navigation based on hierarchical reinforcement learning, hierarchical reinforcement learning based on continuous subgoals is proposed. First, deep deterministic policy gradient and proximal policy optimization
are adopted as the high-level agent and the low-level agent, respectively, ensuring that the hierarchical policy generates stable experience data during training. Then, a new internal reward function is designed so that the agent considers both subgoals and high-reward goals, which makes internal rewards dense. Finally, the experimental results show that the hierarchical policy achieves effective navigation even when the environment's external rewards are sparse; in complex and large-scale environments, the advantages of hierarchical reinforcement learning based on continuous subgoals become even more pronounced.

(2) To augment navigation agents with memory, a memory-augmented episodic value network is proposed. First, the agent's action-value function is associated with an episodic memory module that outputs a weighted sum of the action values of similar past states, improving navigation performance in partially observable environments. Meanwhile, an embedded network structure is introduced to keep the whole network differentiable, so that it can be trained by error back-propagation. Compared with navigation agents without episodic memory, agents based on the memory-augmented episodic value network achieve better navigation performance in partially observable environments.

(3) To impart prior navigation knowledge to an agent, a visual navigation method for deep reinforcement learning via a tutor mechanism is proposed. First, based on the law of total probability, a decision model guided by a tutor is established, and a loss function is constructed using the policy-gradient formula. Then, a tutor-student neural network structure for navigation is designed: the tutor module provides the student module with effective auxiliary navigation information, enabling the student module to navigate effectively. Finally, the experimental results show that the auxiliary navigation information provided by the tutor module accelerates the agent's learning
and leads to higher accumulated rewards. More importantly, the visual navigation model based on the tutor mechanism generalizes well to new, unseen environments.

(4) To address the lack of a navigation training environment for building-decoration robots, a reinforcement learning navigation environment for decoration scenes is designed. First, we analyze and compare the widely used training environments for deep reinforcement learning and select a suitable game engine as the basis of the navigation environment for decoration scenes. Then, according to the features of real decoration scenes and existing house floor layouts, we design and build the navigation environment. The experimental results show that this environment is useful and that navigation agents based on deep reinforcement learning can be trained successfully in it. Finally, we have released the training environment publicly, in the hope that it will promote the development of visual navigation based on deep reinforcement learning.
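The dense internal reward of contribution (1) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the following assumes a common shaping scheme: the low-level agent's reward is the negative distance to the current subgoal, plus a small term pulling toward the final goal, plus any sparse external reward. All names and weights here are hypothetical.

```python
import numpy as np

def internal_reward(state_xy, subgoal_xy, goal_xy, extrinsic_reward,
                    w_subgoal=1.0, w_goal=0.1):
    """Hypothetical dense internal reward for the low-level agent.

    Combines progress toward the current subgoal with the (possibly
    sparse) external goal reward, so the low-level policy receives a
    learning signal at every step rather than only at the goal.
    """
    d_subgoal = np.linalg.norm(np.asarray(state_xy) - np.asarray(subgoal_xy))
    d_goal = np.linalg.norm(np.asarray(state_xy) - np.asarray(goal_xy))
    return -w_subgoal * d_subgoal - w_goal * d_goal + extrinsic_reward
```

Under this sketch the agent is rewarded both for reaching subgoals and for moving toward the high-reward final goal, which is one way to realize the "subgoals and high-reward goals" trade-off the abstract describes.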
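The episodic readout of contribution (2) can likewise be sketched. The abstract specifies only that the value network outputs a weighted sum of the action values of similar states; the nearest-neighbor lookup and inverse-distance weighting below are illustrative assumptions, not the dissertation's actual (differentiable, end-to-end trained) module.

```python
import numpy as np

def episodic_q(query_emb, memory_embs, memory_qs, k=3, eps=1e-6):
    """Memory-augmented action value (illustrative sketch).

    Returns an inverse-distance-weighted average of the stored action
    values of the k most similar past state embeddings.
    """
    dists = np.linalg.norm(memory_embs - query_emb, axis=1)  # (N,)
    idx = np.argsort(dists)[:k]                              # k nearest
    w = 1.0 / (dists[idx] + eps)                             # closer -> heavier
    w /= w.sum()
    return (w[:, None] * memory_qs[idx]).sum(axis=0)         # (num_actions,)
```

States that closely match stored episodes thus reuse past value estimates, which is the intuition behind improved performance under partial observability.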
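Finally, the tutor-guided decision model of contribution (3) invokes the law of total probability. One plausible reading, sketched below with hypothetical shapes, is that the overall action distribution marginalizes over the tutor's suggestions: pi(a|s) = sum_t p(t|s) * pi(a|s, t). The dissertation's actual factorization may differ.

```python
import numpy as np

def tutor_guided_policy(p_tutor_given_s, p_action_given_s_tutor):
    """Illustrative decision model via the law of total probability.

    p_tutor_given_s:        shape (T,)   - tutor suggestion distribution
    p_action_given_s_tutor: shape (T, A) - student policy under each suggestion
    Returns the marginal action distribution, shape (A,).
    """
    p_t = np.asarray(p_tutor_given_s)
    p_a = np.asarray(p_action_given_s_tutor)
    return p_t @ p_a  # mixture of the student's conditional policies
```

The resulting mixture is differentiable in both factors, so a policy-gradient loss can be constructed on it, consistent with the abstract's description.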
Keywords/Search Tags: deep reinforcement learning, visual navigation, episodic memory, tutor mechanism, navigation training environments