Hierarchical Reinforcement Learning

Posted on:2007-09-22

Degree:Doctor

Type:Dissertation

Country:China

Candidate:J Shen

GTID:1118360185966741

Subject:Computer application technology

Abstract/Summary:

Reinforcement learning is an approach in which an agent learns its behaviors through trial-and-error interaction with a dynamic environment. Its self-learning and online learning capabilities have made it an important branch of machine learning. However, reinforcement learning is bedeviled by the curse of dimensionality. Recently, hierarchical reinforcement learning has made great progress in combating the curse of dimensionality. Notable frameworks include Option, HAM, and MAXQ, of which Option and MAXQ are the most widely used. In the Option framework, subtasks are easy to generate automatically, especially by partitioning regions or stages, and the granularity of subtasks is easy to control. However, when subtasks are constructed manually from prior knowledge, it is difficult to describe their structure clearly and to learn their local strategies. The MAXQ approach is well suited to online learning but weak at discovering hierarchies automatically. In addition, its subtasks are not fine-grained enough, and some large-scale subtasks can hardly be decomposed further.

In this dissertation, a novel approach to hierarchical reinforcement learning, named OMQ, is proposed by integrating Options into MAXQ. The theoretical and computational issues in OMQ are addressed, as well as the problems that arise in practice.

The main contributions of this dissertation are:

1) The OMQ approach for hierarchical reinforcement learning is presented, and its theoretical framework and learning algorithm are discussed. The OMQ framework combines the advantages of Option and MAXQ: hierarchies can be constructed manually from prior knowledge and can also be generated automatically during learning. Using results from stochastic approximation theory, an inductive proof is given that the OMQ learning algorithm converges with probability 1 to the unique recursively optimal policy under the same convergence conditions as MAXQ.
The experimental results show that the OMQ learning algorithm has better performance than that of...
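
For readers unfamiliar with the Option framework mentioned in the abstract, the following is a minimal Python sketch of a single option: an initiation set, an internal policy, and a termination condition, executed as one temporally extended action. It is not the OMQ algorithm and is not taken from the dissertation. The corridor environment and the names `Option`, `step`, `run_option`, and `go_right` are hypothetical, and termination is simplified to a deterministic test, whereas the general framework allows a termination probability.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple

State = int  # hypothetical: a cell index in a 1-D corridor with cells 0..4


@dataclass
class Option:
    """One option: where it may start, how it acts, and when it stops."""
    initiation_set: Set[State]            # states in which the option may be invoked
    policy: Callable[[State], int]        # internal policy: state -> primitive action
    terminates: Callable[[State], bool]   # simplified deterministic termination test


def step(state: State, action: int) -> Tuple[State, float]:
    """Hypothetical environment: move left (-1) or right (+1), reward -1 per step."""
    next_state = max(0, min(4, state + action))
    return next_state, -1.0


def run_option(option: Option, state: State, gamma: float = 0.9) -> Tuple[State, float, int]:
    """Run the option until it terminates.

    Returns the final state, the discounted reward accumulated while the
    option was running, and the number of primitive steps taken.
    """
    if state not in option.initiation_set:
        raise ValueError(f"option cannot be started in state {state}")
    total_reward, discount, steps = 0.0, 1.0, 0
    while True:
        state, reward = step(state, option.policy(state))
        total_reward += discount * reward
        discount *= gamma
        steps += 1
        if option.terminates(state):
            return state, total_reward, steps


# Hypothetical subtask: walk right until the last cell is reached.
go_right = Option(
    initiation_set={0, 1, 2, 3},
    policy=lambda s: +1,
    terminates=lambda s: s == 4,
)
```

A higher-level learner can treat `run_option(go_right, s)` as a single action whose outcome is the triple it returns. This is the sense in which options act as subtasks in a hierarchy.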

Keywords/Search Tags:

Hierarchical reinforcement learning, Immune clustering, Automatic hierarchy, Multi-agent hierarchical reinforcement learning

Related items

1	A Study Of Hierarchical Reinforcement Learning Algorithm Based On Fuzzy Clustering
2	Research Of Reinforcement Learning Based On Clustering Analysis
3	Multi-Agent Dynamic Hierarchical Reinforcement Learning Based On Hybrid Abstraction
4	Research On An Approach Of Hierarchical Reinforcement Learning Based On Option Automatic Generation
5	Researches On Hierarchical Reinforcement Learning Based On Abstract Actions
6	Research On Goal-Conditioned Hierarchical Multi-Agent Reinforcement Learning For Cooperative Environment
7	Continuous Time Hierarchical Reinforcement Learning Algorithm
8	Research On Key Technologies Of Multi-agent Cooperation Problems Based On Reinforcement Learning
9	Research On Hierarchical Reinforcement Learning Based On Action Space Partitioning
10	A Research Of Hierarchical Multi-agents Deep Reinforcement Learning For Action Game