
Research And Implementation Of Deep Reinforcement Learning Algorithm For Visual Perception And Navigation

Posted on: 2020-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Wu
Full Text: PDF
GTID: 2428330572487963
Subject: Control engineering
Abstract/Summary:
With the development of computer vision, making intelligent decisions based only on visual perception has become increasingly important. Recently, deep reinforcement learning has been leveraged to obtain this decision-making ability from visual perception. However, deep reinforcement learning, which uses a deep neural network as the function approximator, suffers from problems such as unstable training and sample inefficiency. When it depends only on visual observations as input, an agent struggles to learn effectively and make appropriate decisions. Focusing on these limitations of deep reinforcement learning, this work proposes two algorithms to alleviate them.

Curriculum learning is often introduced to improve agent training on complex tasks, where the goal is to generate a sequence of easier subtasks for an agent to train on, such that final performance or learning speed is improved. This work presents a novel curriculum learning strategy by introducing the concept of master-slave agents and enabling flexible action settings for agent training. Multiple agents are trained concurrently within different action spaces by sharing a perception network with an asynchronous strategy. Extensive evaluation on the VizDoom platform demonstrates that the jointly trained master agent and slave agents benefit each other. Significant improvement is obtained over A3C in terms of learning speed and performance.

Learning to adapt to a series of different goals in visual navigation is challenging. This work presents a model-embedded actor-critic architecture for the multi-goal visual navigation task. To enhance task cooperation in multi-goal learning, two new designs are introduced to the reinforcement learning scheme: an inverse dynamics model (InvDM) and multi-goal co-learning (MgCl). Specifically, InvDM is proposed to capture the navigation-relevant association between state and goal, and to provide additional training signals that relieve the sparse reward issue. MgCl aims to improve sample efficiency and lets the agent learn from unintentional positive experiences. Extensive results on the interactive platform AI2-THOR demonstrate that the proposed method converges faster than state-of-the-art methods while producing more direct routes to the goal.
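
The abstract does not specify how MgCl turns "unintentional positive experiences" into training data. One common way to realize that idea is to relabel a trajectory for any other goal the agent happened to pass through. The following is a minimal Python sketch of such relabeling, not the thesis's implementation: the function name relabel_for_goals, the transition layout, and the reward values goal_reward and step_penalty are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

# Hypothetical transition layout: (state, action, next_state).
Transition = Tuple[Hashable, int, Hashable]
# Relabeled layout: (state, action, next_state, goal, reward).
Relabeled = Tuple[Hashable, int, Hashable, Hashable, float]


def relabel_for_goals(
    trajectory: Sequence[Transition],
    goals: Sequence[Hashable],
    goal_reward: float = 1.0,     # assumed reward for reaching a goal
    step_penalty: float = -0.01,  # assumed per-step cost
) -> Dict[Hashable, List[Relabeled]]:
    """For each goal visited along the trajectory, return the prefix that
    ends at the first visit, with rewards recomputed for that goal."""
    relabeled: Dict[Hashable, List[Relabeled]] = {}
    for goal in goals:
        for t, (_, _, next_state) in enumerate(trajectory):
            if next_state == goal:
                # Keep only the steps up to and including the first visit.
                relabeled[goal] = [
                    (s, a, s2, goal, goal_reward if i == t else step_penalty)
                    for i, (s, a, s2) in enumerate(trajectory[: t + 1])
                ]
                break
    return relabeled


# Example: the agent was heading for "D" but also passed through "C".
example_trajectory = [("A", 0, "B"), ("B", 1, "C"), ("C", 0, "D")]
example = relabel_for_goals(example_trajectory, goals=["C", "D", "Z"])
```

In this sketch, example holds one relabeled episode for "C" and one for "D", while "Z" is absent because it was never visited. Each relabeled episode could then serve as extra training data for the corresponding goal, which is one way the sample-efficiency benefit described in the abstract might arise.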
Keywords/Search Tags: Deep reinforcement learning, Perception and decision, Curriculum learning, Visual navigation