
Research on Reinforcement Learning and Its Visual Navigation Application Techniques

Posted on: 2024-04-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Lv
Full Text: PDF
GTID: 1528307079950739
Subject: Computer Science and Technology
Abstract/Summary:
With the rise of Artificial Intelligence (AI) in recent years, the research and development of AI-powered, human-like decision-making systems have attracted increasing numbers of researchers. Intelligent decision-making systems have a profound influence on social life and have achieved remarkable results in video games, board games, autonomous driving, mobile robots, and robot assistants. Reinforcement Learning (RL) is a vital strategy for effectively solving decision-making problems, providing new visions and technical support for numerous AI applications across fields. This dissertation addresses two aspects of RL-based intelligent decision-making: solving unknown tabular Markov Decision Processes (MDPs) by exploiting structural properties, and addressing visual navigation in 3D indoor scenes using Deep Reinforcement Learning. These two problems are representative instances of practical RL applications and have profound significance for developing sounder intelligent decision-making algorithms.

RL aims to learn how to act in an environment so as to maximize cumulative return. Traditional RL algorithms suffer from slow convergence, as an agent needs extensive interactions with its environment to gather sufficient experience for effective learning. However, the underlying MDP often exhibits similar or equivalent structural properties, and ignoring these properties is an essential factor behind the slow convergence of many RL algorithms. This dissertation proposes a new model-free RL algorithm that leverages the equivalence structure in an MDP to improve sample efficiency during interaction between the agent and the environment, together with a mathematical analysis of its high-probability sample complexity. The proposed algorithm utilizes structural properties of the state-action space to accelerate the convergence of the value function and policy.

Visual navigation in 3D indoor scenes involves interactions between environments and agents. A fundamental component of this interplay is understanding the correlation and causation between the agent's behaviors and environmental changes. Deep Reinforcement Learning (DRL), which combines the perception ability of Deep Learning (DL) with the decision-making characteristics of RL, has become popular for visual navigation in complex environments. Although DRL-based visual navigation algorithms have achieved notable success, several difficulties and challenges remain: (1) comprehensive understanding of navigation tasks, including accurately identifying and locating targets in observed images; (2) effective exploration strategies that produce reasonable actions when targets are not within the agent's field of view; (3) few-shot or zero-shot learning, i.e., inferring possible target locations when the agent has little or no experience searching for the specified targets; (4) efficiency and safety, i.e., formulating reasonable and safe actions to find targets, such as avoiding obstacles and following shortest paths; (5) generalization, i.e., navigating successfully to objects with different colors/shapes/sizes and in scenes with different layouts/backgrounds. To address these challenges, this dissertation investigates visual navigation models that combine DRL algorithms with a Knowledge Graph (KG). These models demonstrate their effectiveness in two settings: Target-driven Visual Navigation (TDVN) and Visual Semantic Navigation (VSN). TDVN aims to find a minimum-length action sequence that moves an agent from its starting position to targets specified by RGB images. VSN aims to navigate to targets specified by object categories: the agent perceives its surroundings via egocentric views, and its targets are described by text words. Incorporating a knowledge graph into visual navigation models has two advantages: (1) it encodes semantic connections and spatial relationships between different object categories; (2) as scene priors, it provides spatial and semantic associations between known objects and novel objects, for which the agent has no searching experience.

Intelligent decision-making is a remarkable manifestation and broad application of AI in numerous areas, and research on RL algorithms and visual navigation approaches in 3D indoor scenes has surged in recent years. Building on previous work, the contributions of this dissertation include the following three aspects.

First, this dissertation explores RL algorithms that exploit the equivalence properties of MDPs. We present a new model-free RL algorithm called QLES. It is a variant of the classic Q-learning algorithm that takes advantage of the equivalence structure on the state-action space exhibited by the MDP. Moreover, a non-asymptotic PAC-type sample complexity bound is derived, which characterizes the algorithm's benefit in terms of the relative cover time, and various numerical experiments confirm the advantage.

Second, this dissertation studies TDVN in 3D indoor scenes using DRL algorithms. We propose a model combining visual features and graph attention to learn the navigation policy. Deep residual networks extract abstract features from observations, and graph convolutional networks produce graph features that encode spatial relationships between objects. We also present a target skill extension module that generates sub-targets, enabling the agent to learn from its failed trajectories.

Third, this dissertation investigates VSN in 3D indoor scenes using DRL algorithms. We propose a graph-based spatiotemporal attention model to guide policy search, which integrates the semantic information of observed objects and the spatial information of their locations. It consists of three attention units: the target attention unit learns to extract target-relevant information; the action attention unit considers the agent's last action; and the memory attention unit summarizes historical experience, constructed from a 3D global graph and local graphs.

Overall, this dissertation concentrates on exploiting the state-action equivalence properties of the underlying MDP, and on DRL algorithms with KGs for visual navigation in 3D indoor scenes. Extensive experiments and mathematical analysis demonstrate the effectiveness of the proposed algorithms.
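As a loose illustration of the equivalence idea behind QLES (the dissertation's exact algorithm, notation, and sample-complexity machinery are not reproduced here; the function name, toy MDP, and class mapping below are all hypothetical), a tabular Q-learning variant can keep one Q-entry per equivalence class of state-action pairs, so every update to one pair immediately informs every pair in its class:

```python
import random
from collections import defaultdict

def q_learning_with_equivalence(env_step, classes, start, actions,
                                episodes=500, horizon=50,
                                alpha=0.2, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning with one Q-entry per equivalence class.

    classes maps each (state, action) pair to an equivalence-class id,
    so equivalent pairs share (and jointly update) a single Q-value.
    """
    rng = random.Random(seed)
    q = defaultdict(float)

    for _ in range(episodes):
        s = start
        for _ in range(horizon):
            if rng.random() < eps:           # epsilon-greedy exploration
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda b: q[classes[(s, b)]])
            s2, r, done = env_step(s, a, rng)
            target = r if done else r + gamma * max(
                q[classes[(s2, b)]] for b in actions)
            k = classes[(s, a)]
            q[k] += alpha * (target - q[k])  # one update serves the whole class
            if done:
                break
            s = s2
    return q

# Toy illustration: two identical 4-state chains; mirrored states share
# equivalence classes, so experience gathered in chain 0 (the only chain
# ever visited) transfers directly to the unvisited chain 1.
states = [(i, c) for c in (0, 1) for i in range(4)]
actions = ["L", "R"]
classes = {((i, c), a): (i, a) for (i, c) in states for a in actions}

def env_step(s, a, rng):
    i, c = s  # deterministic chain dynamics; rng kept for stochastic envs
    i2 = min(i + 1, 3) if a == "R" else max(i - 1, 0)
    return (i2, c), (1.0 if i2 == 3 else 0.0), i2 == 3

q = q_learning_with_equivalence(env_step, classes, (0, 0), actions)
```

The point of the sketch is the shared Q-table key: the agent only ever starts in chain 0, yet the greedy policy it induces is already correct in chain 1, mirroring the sample-efficiency benefit the abstract attributes to exploiting equivalence structure.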
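The target attention unit mentioned for the VSN model can be loosely sketched as scaled dot-product attention over object-node features (a minimal illustration with hypothetical names and shapes, not the dissertation's actual architecture): each observed object node is scored against the target's word embedding, and a softmax over the scores pools the nodes into one target-relevant context vector.

```python
import numpy as np

def target_attention(node_feats, target_vec):
    """Score each object node against the target embedding and pool.

    node_feats: (n, d) array of object-node features (e.g. GCN outputs).
    target_vec: (d,) embedding of the target word.
    Returns the attended context vector and the attention weights.
    """
    d = node_feats.shape[1]
    scores = node_feats @ target_vec / np.sqrt(d)  # scaled dot-product
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # softmax over nodes
    context = weights @ node_feats                 # weighted node summary
    return context, weights

# Tiny demo: the node most aligned with the target gets the largest weight.
nodes = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
target = np.array([1.0, 0.0])
context, weights = target_attention(nodes, target)
```

In a full model the action and memory attention units would produce analogous context vectors, which would then be concatenated and fed to the policy network; here only the single-unit pooling step is shown.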
Keywords/Search Tags: Visual Navigation, 3D Indoor Scenes, Reinforcement Learning, Deep Reinforcement Learning, Knowledge Graph, Graph Neural Networks, Attention Mechanism