
Research on Hierarchical Reinforcement Learning Based on Abstract Actions

Posted on: 2017-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Xu
Full Text: PDF
GTID: 2308330488961930
Subject: Software engineering
Abstract/Summary:
Reinforcement learning has a strong capacity for autonomous learning in complex systems and has been widely applied in practice. Its development, however, is hampered by the "curse of dimensionality". Hierarchical reinforcement learning decomposes the learning task into multiple subtasks and solves them separately, which effectively mitigates this problem. The Option framework is one of the three major hierarchical reinforcement learning frameworks; building on it, this thesis proposes several hierarchical reinforcement learning methods for control optimization and automatic abstraction. The main contributions are as follows:

i. To address the problem that traditional abstract-action-based methods cannot handle learning and control in dynamic environments, we propose an online learning algorithm with interrupting abstract actions, called IMQ, and prove its convergence theoretically. IMQ can cope with large-scale problems that traditional reinforcement learning methods are unable to handle. By combining the idea of interruption with the characteristics of dynamic environments, IMQ improves the efficiency of learning and of the control strategy in such environments.

ii. Regarding the problem that identifying sub-goals takes a long time because of heavy trajectory-sampling noise in diverse-density-based abstraction discovery, we propose a new algorithm that autonomously discovers abstract actions from acyclic state trajectories under the diverse-density metric. By reducing the noise in the trajectory samples, the algorithm effectively shortens the learning time and improves the discovered abstract actions. It avoids the heavy computation caused by excessive sampling, not only reducing the time needed to identify sub-goals but also discovering better abstract actions, which improves the learning efficiency of the algorithm.

iii. In view of the problem that traditional DT-SMDP-based automatic hierarchical methods cannot be applied directly to continuous-time tasks, we put forward a new CT-SMDP-based automatic hierarchical reinforcement learning method for finite continuous-time tasks. The algorithm achieves good control and learning performance on continuous-time tasks.
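The interruption idea behind IMQ (item i) can be illustrated with a minimal sketch — this is an assumption-laden toy, not the thesis's actual algorithm: SMDP Q-learning over options in a one-dimensional corridor, where an executing option is abandoned as soon as another option's estimated value at the current state is strictly higher. The environment, the two "move right" options, and all names are invented for illustration.

```python
import random

N_STATES = 8            # corridor states 0..7, goal at 7
GAMMA = 0.9             # discount factor
ALPHA = 0.1             # learning rate

# Hypothetical options: each tries to move right a fixed number of steps.
OPTIONS = {"right1": 1, "right4": 4}

def step(s):
    """Primitive transition: move one cell right; reaching the goal pays 1."""
    s2 = min(s + 1, N_STATES - 1)
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done

def q_learn_interrupting(episodes=2000, seed=1, epsilon=0.1):
    rng = random.Random(seed)
    Q = {(s, o): 0.0 for s in range(N_STATES) for o in OPTIONS}
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # epsilon-greedy choice over options
            if rng.random() < epsilon:
                o = rng.choice(list(OPTIONS))
            else:
                o = max(OPTIONS, key=lambda opt: Q[(s, opt)])
            s0, r_sum, disc, k = s, 0.0, 1.0, 0
            while k < OPTIONS[o]:
                s, r, done = step(s)
                r_sum += disc * r   # accumulate discounted reward inside the option
                disc *= GAMMA
                k += 1
                if done:
                    break
                # interruption: abandon o if another option now looks better here
                if max(Q[(s, o2)] for o2 in OPTIONS) > Q[(s, o)]:
                    break
            best_next = 0.0 if s == N_STATES - 1 else max(Q[(s, o2)] for o2 in OPTIONS)
            # SMDP Q-learning update: discount by gamma^k, k = steps actually taken
            Q[(s0, o)] += ALPHA * (r_sum + disc * best_next - Q[(s0, o)])
    return Q
```

States adjacent to the goal converge toward value 1, while the start state's value is bounded by the multi-step discount, so interruption never hurts: cutting an option short simply re-opens the choice at the current state.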
Keywords/Search Tags: hierarchical reinforcement learning, Option, abstraction, automatic hierarchy, automatic discovery
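The diverse-density intuition behind the sub-goal discovery of item ii can also be made concrete with a toy sketch — again my own crude formulation under stated assumptions, not the thesis's algorithm: a state that appears in many successful trajectories but few unsuccessful ones is a sub-goal candidate (e.g., a doorway between two rooms). Deduplicating each trajectory with `set` loosely echoes the thesis's use of acyclic trajectories to suppress sampling noise. All state names and trajectories below are invented.

```python
from collections import Counter

def diverse_density_scores(pos_bags, neg_bags):
    """Crude diverse-density-style score per state:
    (fraction of successful trajectories containing it)
    - (fraction of unsuccessful trajectories containing it).
    Each trajectory is deduplicated, so revisits (cycles) count once."""
    pos_count = Counter(s for bag in pos_bags for s in set(bag))
    neg_count = Counter(s for bag in neg_bags for s in set(bag))
    states = set(pos_count) | set(neg_count)
    return {s: pos_count[s] / len(pos_bags)
               - neg_count[s] / max(1, len(neg_bags))
            for s in states}

# Toy two-room data: every successful trajectory passes through 'door'.
pos = [["start", "a", "door", "goal"],
       ["start", "b", "door", "goal"],
       ["start", "door", "door", "c", "goal"]]   # cycle at 'door', deduped
neg = [["start", "a", "b", "a"],                 # wandering in the first room
       ["start", "b", "a"]]

scores = diverse_density_scores(pos, neg)
```

Here `scores["door"]` is 1.0 while `scores["start"]` is 0.0, since the start state occurs in failures as well. (The goal state trivially scores as high as the doorway; a real discovery method filters terminal states before picking the peak.)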