Autonomous Mission Decomposition Based On Hierarchical Reinforcement Learning

Posted on:2007-10-24

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Tang

Full Text:PDF

GTID:2178360185975659

Subject:Computer application technology

Abstract/Summary:

Reinforcement learning (RL) is an effective method for solving planning problems in stochastic environments. However, in large state spaces, especially in applications with complex stochastic states, the "curse of dimensionality" has not yet been solved. Hierarchical reinforcement learning, which extends reinforcement learning in both the state space and the action space, has proven more effective for large-scale stochastic control problems and is applicable to AGV navigation. In almost all existing research, however, the hierarchical structure is designed in advance, and relatively little work addresses autonomously discovering or creating useful hierarchies. Based on this observation, the thesis investigates the following aspects.

First, the theoretical background and development of hierarchical reinforcement learning are introduced. Three typical RL algorithms are discussed and compared, and empirical results are presented to show their differences and characteristics, which provides a basis for choosing the right algorithm in the subsequent work.

Second, methods for autonomously finding useful subgoals are studied in two different kinds of environment. When the environment is simple, McGovern's method learns the model too slowly, so an Actor-Critic method based on the Boltzmann distribution is proposed to learn the environment. To create subgoals autonomously, the thesis first analyzes the properties of the learned policy model and proposes a new concept, the frequency change ratio; it then chooses the state with the maximum frequency change ratio as the subgoal. When the environment is relatively complex, a heuristic method is used to create a subgoal action sequence, and the states that are not on the successful path are then deleted to form the useful subgoals.

Finally, the thesis improves the method McGovern used to form the hierarchical policy. First, a SARSA algorithm based on heuristic search is proposed to determine the agent's choice of action options. Then the new subgoals are added to the original action set as new abstract actions to form the hierarchical policy.
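The abstract does not say how the Actor-Critic learner uses the Boltzmann distribution. As an illustration only, the sketch below shows the standard Boltzmann (softmax) action-selection rule that such a learner commonly relies on. The function names, the `temperature` parameter, and the use of a plain list of action preferences are assumptions made for this example, not details taken from the thesis.

```python
import math
import random


def boltzmann_probabilities(preferences, temperature=1.0):
    """Return Boltzmann (softmax) probabilities for a list of action preferences.

    Hypothetical helper: `preferences` stands in for the actor's per-action
    preference values in one state; the thesis may define them differently.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    highest = max(preferences)
    weights = [math.exp((p - highest) / temperature) for p in preferences]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(preferences, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(preferences, temperature)
    return rng.choices(range(len(preferences)), weights=probs, k=1)[0]
```

With a high temperature the probabilities approach uniform, so the agent explores more. With a low temperature the action with the largest preference dominates.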

Keywords/Search Tags:

autonomous mission decomposition, hierarchical reinforcement learning, heuristic search, hierarchical policy, abstract

Related items

1	Researches On Hierarchical Reinforcement Learning Based On Abstract Actions
2	Hierarchical Reinforcement Learning
3	Research On Hierarchical Reinforcement Learning Based On Action Space Partitioning
4	The Decomposition And Reconstruction Of Complex Environment In Reinforcement Learning
5	Research On The Sparse Reward Problem Based On Hierarchical Reinforcement Learning
6	Hierarchical Reinforcement Learning And Its Application To Obstacle Avoidance Problem For Manipulator
7	Continuous Time Hierarchical Reinforcement Learning Algorithm
8	Reinforcement Learning Based On Spectral Graph Theory
9	Hierarchical reinforcement learning using automatic task decomposition and exploration shaping
10	Research On Quantitative Strategy Based On Hierarchical Deep Reinforcement Learning