
Hierarchical Reinforcement Learning And Its Application To Obstacle Avoidance Problem For Manipulator

Posted on: 2013-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: X D Jin
Full Text: PDF
GTID: 2248330362474928
Subject: Control Science and Engineering
Abstract/Summary:
Building on the Markov decision process (MDP), hierarchical reinforcement learning introduces abstraction and task decomposition mechanisms at the level of actions, states, and policies to construct a hierarchical structure. Following a "divide and conquer" strategy, the agent learns separately at different levels of abstraction, which effectively mitigates the curse of dimensionality. Hierarchical reinforcement learning therefore has broad application prospects in complicated learning tasks.

Hierarchies designed by hand cannot adapt to increasingly complex and changeable environments. To solve complex, large-scale problems, giving the agent the ability to explore on its own, decompose tasks, and construct the hierarchical structure during learning has become a hot research topic, and subgoal identification is one of the most active problems in this field. This thesis studies and discusses the following aspects.

First, the research progress and theory of hierarchical reinforcement learning methods are summarized and compared.

Second, an improved online subgoal discovery method based on the amount of information in actions is proposed. The method uses the amount of goal information contained in the actions the agent takes to distinguish between states and to find the states that are critical for achieving the learning task. On that basis, the thesis designs an algorithm that automatically discovers subgoals online in a large state space, thereby achieving automatic task decomposition. Simulation experiments in a maze path-finding environment built on a two-dimensional grid with obstacles compare the proposed method with the classical algorithm and verify it.

Finally, taking autonomous obstacle avoidance for a redundant serial manipulator with three degrees of freedom as the application background, a hierarchical reinforcement learning model for manipulator collision avoidance is designed. The model maps the original high-dimensional trajectory learning problem in joint space to a low-dimensional path learning problem and incorporates the automatic layering method proposed in this thesis, which finds critical path points early in learning and generates large trajectory segments for obstacle avoidance learning. A manipulator simulation platform is built with the ODE physics engine, and a comparison of non-hierarchical and hierarchical learning verifies the effectiveness and feasibility of hierarchical reinforcement learning for high-dimensional, large-scale learning problems.
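The abstract does not give the formula behind "the amount of goal information contained in the action". As a rough illustration only, one possible reading is the mutual information between the action taken in a state and the goal being pursued: states where the action depends strongly on the goal score high and become subgoal candidates. The sketch below assumes that reading. The names `action_goal_information`, `pick_subgoal`, and `state_counts`, along with the toy counts, are hypothetical and are not taken from the thesis.

```python
import math

def action_goal_information(counts):
    """Mutual information I(A; G) in bits from a goals-by-actions count table."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    num_actions = len(counts[0])
    # Marginal distributions over goals (rows) and actions (columns).
    p_goal = [sum(row) / total for row in counts]
    p_action = [sum(row[a] for row in counts) / total for a in range(num_actions)]
    info = 0.0
    for g, row in enumerate(counts):
        for a, n in enumerate(row):
            if n > 0:  # zero-count cells contribute nothing
                p_joint = n / total
                info += p_joint * math.log2(p_joint / (p_goal[g] * p_action[a]))
    return info

def pick_subgoal(state_counts):
    """Return the highest-scoring state and the score of every state."""
    scores = {s: action_goal_information(c) for s, c in state_counts.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy data: rows are goals, columns are actions (left, right).
state_counts = {
    "corridor": [[5, 5], [5, 5]],    # same action mix whatever the goal
    "junction": [[10, 0], [0, 10]],  # action fully determined by the goal
}
best, scores = pick_subgoal(state_counts)
print(best, scores)
```

In this toy table the junction state is selected because its actions fully reveal the goal, while the corridor state carries no goal information. An online variant would update the counts as the agent explores rather than use a fixed table.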
Keywords/Search Tags: hierarchical reinforcement learning, automatic decomposition, subgoal, mutual information, redundant manipulator, obstacle avoidance