
Hierarchical Reinforcement Learning

Posted on: 2007-09-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Shen
Full Text: PDF
GTID: 1118360185966741
Subject: Computer application technology
Abstract/Summary:
Reinforcement learning is an approach in which an agent learns its behaviors through trial-and-error interaction with a dynamic environment. It has become an important branch of machine learning because of its self-learning and online learning capabilities. However, reinforcement learning is bedeviled by the curse of dimensionality. Recently, hierarchical reinforcement learning has made great progress in combating the curse of dimensionality. Notable frameworks include Option, HAM, and MAXQ, of which Option and MAXQ are the most widely used. In the Option framework, subtasks are easy to generate automatically, especially by partitioning regions or stages, and the granularity of subtasks is easy to control. However, it is difficult to describe the structure of subtasks clearly and to learn local strategies when the subtasks are constructed manually from prior knowledge. The MAXQ approach is well suited to online learning but weak at discovering hierarchies automatically. In addition, its subtasks are not fine-grained enough, and some large-scale subtasks can hardly be decomposed further.

In this dissertation, a novel approach to hierarchical reinforcement learning, named OMQ, is proposed by integrating Options into MAXQ. The theoretical and computational issues in OMQ are addressed, as well as the problems that arise in practice.

The main contributions of this dissertation are:

1) The OMQ approach to hierarchical reinforcement learning is presented, and its theoretical framework and learning algorithm are discussed. The OMQ framework combines the advantages of Option and MAXQ, i.e., hierarchies can not only be constructed manually from prior knowledge but also be generated automatically during learning. Using results from stochastic approximation theory, an inductive proof is given that the OMQ learning algorithm converges with probability 1 to the unique recursively optimal policy under the same convergence conditions as MAXQ.
The experimental results show that the OMQ learning algorithm has better performance than that of...
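
As an illustration of the Option framework mentioned in the abstract, the sketch below models an option in the standard way, as an initiation set, an internal policy, and a termination condition, and runs it until it terminates. It is not the OMQ algorithm from the dissertation; the names Option, run_option, corridor_step, and go_right, the six-state corridor, the deterministic termination test, and the undiscounted reward sum are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

State = int
Action = int


@dataclass(frozen=True)
class Option:
    """A temporally extended action (hypothetical minimal form)."""
    initiation_set: FrozenSet[State]      # states where the option may start
    policy: Callable[[State], Action]     # action chosen while the option runs
    terminates: Callable[[State], bool]   # assumed deterministic termination


def run_option(
    option: Option,
    state: State,
    step: Callable[[State, Action], Tuple[State, float]],
    max_steps: int = 100,
) -> Tuple[State, float, int]:
    """Follow the option's policy until it terminates or max_steps is hit.

    Returns (final state, undiscounted reward sum, number of steps taken).
    """
    if state not in option.initiation_set:
        raise ValueError("option is not available in this state")
    total_reward = 0.0
    steps = 0
    while steps < max_steps:
        state, reward = step(state, option.policy(state))
        total_reward += reward
        steps += 1
        if option.terminates(state):
            break
    return state, total_reward, steps


def corridor_step(state: State, action: Action) -> Tuple[State, float]:
    """Toy environment (assumed): states 0..5 in a line, reward -1 per move."""
    next_state = max(0, min(5, state + action))
    return next_state, -1.0


# A "walk to the right end" subtask expressed as an option.
go_right = Option(
    initiation_set=frozenset(range(5)),
    policy=lambda s: 1,
    terminates=lambda s: s == 5,
)

final_state, total_reward, steps = run_option(go_right, 2, corridor_step)
```

A higher-level learner can treat go_right as a single action whose outcome is the tuple returned by run_option, which is how options reduce the effective size of the decision problem.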
Keywords/Search Tags:Hierarchical reinforcement learning, Immune clustering, Automatic hierarchy, Multi-agent hierarchical reinforcement learning