
Hierarchical reinforcement learning in continuous state and multi-agent environments

Posted on: 2006-04-02
Degree: Ph.D.
Type: Thesis
University: University of Massachusetts Amherst
Candidate: Ghavamzadeh, Mohammad
Full Text: PDF
GTID: 2458390008471896
Subject: Computer Science
Abstract/Summary:
This dissertation investigates the use of hierarchy and abstraction as a means of solving complex sequential decision-making problems, such as those with continuous state and/or action spaces and domains with multiple cooperative agents. It develops several novel extensions to hierarchical reinforcement learning (HRL) and designs algorithms appropriate for such problems.

It has been shown that the average reward optimality criterion is more natural than the more commonly used discounted criterion for continuing tasks. This thesis investigates two formulations of HRL based on the average reward semi-Markov decision process (SMDP) model, in both discrete and continuous time. These formulations correspond to the two notions of optimality explored in previous work on HRL: hierarchical optimality and recursive optimality. Novel discrete-time and continuous-time algorithms, termed hierarchically optimal average reward RL (HAR) and recursively optimal average reward RL (RAR), are presented; they learn hierarchically and recursively optimal average reward policies, respectively. Two automated guided vehicle (AGV) scheduling problems serve as experimental testbeds for an empirical study of the proposed algorithms.

Policy gradient reinforcement learning (PGRL) methods have several advantages over the more traditional value-function RL algorithms for problems with continuous state spaces, but they suffer from slow convergence. This thesis defines a family of hierarchical policy gradient RL (HPGRL) algorithms for scaling PGRL methods to high-dimensional domains.

This thesis also examines the use of HRL to accelerate policy learning in cooperative multi-agent tasks. Hierarchy speeds up learning in multi-agent domains by making it possible to learn coordination skills at the level of subtasks rather than primitive actions. Subtask-level coordination improves cooperation because agents are not distracted by low-level details. A framework for hierarchical multi-agent RL is developed, and an algorithm called Cooperative HRL is presented that solves cooperative multi-agent problems more efficiently. (Abstract shortened by UMI.)
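As background for the average reward formulations above, the standard average reward (gain) criterion can be stated as follows; this is the textbook definition, not a formula quoted from the dissertation. For a stationary policy $\mu$ in a discrete-time MDP,
\[
g^{\mu} \;=\; \lim_{n \to \infty} \frac{1}{n}\, \mathbb{E}^{\mu}\!\left[\sum_{t=0}^{n-1} r(s_t, a_t)\right],
\]
while for an SMDP, where the action taken at decision epoch $t$ occupies a random sojourn time $\tau(s_t, a_t)$,
\[
g^{\mu} \;=\; \lim_{n \to \infty} \frac{\mathbb{E}^{\mu}\!\left[\sum_{t=0}^{n-1} r(s_t, a_t)\right]}{\mathbb{E}^{\mu}\!\left[\sum_{t=0}^{n-1} \tau(s_t, a_t)\right]}.
\]
In the HRL literature, a hierarchically optimal policy achieves the best gain attainable by any policy consistent with the given hierarchy, whereas a recursively optimal policy makes each subtask optimal given the fixed policies of its child subtasks.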
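To make the policy gradient discussion concrete, below is a minimal REINFORCE-style sketch on a toy chain problem, written in Python with NumPy. The task, the tabular softmax parameterisation, and the step size are illustrative assumptions; this is not the dissertation's HPGRL family, which additionally decomposes the policy along a task hierarchy.

    # Minimal REINFORCE-style policy gradient sketch (illustrative only).
    import numpy as np

    rng = np.random.default_rng(0)

    N_STATES, N_ACTIONS = 5, 2      # chain of 5 states; actions: left / right
    GOAL = N_STATES - 1             # reaching the right end yields reward 1

    theta = np.zeros((N_STATES, N_ACTIONS))  # one preference per (state, action)

    def policy(s):
        """Softmax action probabilities for state s."""
        prefs = theta[s] - theta[s].max()
        p = np.exp(prefs)
        return p / p.sum()

    def run_episode(max_steps=50):
        """Roll out one episode; return (state, action, return-to-go) triples."""
        s, traj, rewards = 0, [], []
        for _ in range(max_steps):
            a = rng.choice(N_ACTIONS, p=policy(s))
            s_next = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
            r = 1.0 if s_next == GOAL else 0.0
            traj.append((s, a))
            rewards.append(r)
            s = s_next
            if r > 0:
                break
        returns = np.cumsum(rewards[::-1])[::-1]   # undiscounted return-to-go
        return [(s, a, g) for (s, a), g in zip(traj, returns)]

    alpha = 0.1
    for episode in range(500):
        for s, a, g in run_episode():
            grad = -policy(s)           # d log pi(a|s) / d theta[s]
            grad[a] += 1.0
            theta[s] += alpha * g * grad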
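The multi-agent idea of coordinating at the level of subtasks can likewise be sketched as joint-action value learning over high-level subtask choices rather than primitive actions. The subtask names, the two-agent setting, and the update rule below are assumptions chosen for illustration, not the Cooperative HRL algorithm itself.

    # Sketch: joint-action values over subtask choices for two cooperating agents.
    from collections import defaultdict
    import itertools

    SUBTASKS = ("collect", "deliver", "wait")     # hypothetical high-level subtasks

    # Q[state][(my_subtask, other_subtask)]: value of a joint subtask assignment.
    Q = defaultdict(lambda: {j: 0.0 for j in itertools.product(SUBTASKS, repeat=2)})

    def greedy_joint_choice(state):
        """Pick the joint subtask assignment with the highest learned value."""
        return max(Q[state], key=Q[state].get)

    def update(state, joint, reward, next_state, alpha=0.1, gamma=0.95):
        """One Q-learning update at the subtask (SMDP decision epoch) level."""
        best_next = max(Q[next_state].values())
        Q[state][joint] += alpha * (reward + gamma * best_next - Q[state][joint])

Because coordination decisions are made only at subtask boundaries, the agents learn over a joint space of a few subtasks rather than over every combination of primitive actions at every time step.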
Keywords/Search Tags:Multi-agent, Continuous state, Reinforcement learning, HRL, Hierarchical, Average reward, Cooperative