Research On Reinforcement Learning Methods Based On Direct Policy Search

Posted on:2014-10-25

Degree:Master

Type:Thesis

Country:China

Candidate:Q Da

Full Text:PDF

GTID:2308330482952238

Subject:Computer application technology

Abstract/Summary:

Reinforcement learning is a major research topic in machine learning that aims to let an agent improve itself through trial-and-error interaction with its environment. With the development of modern supervised learning algorithms and optimization techniques, how to apply supervised learning or optimization methods to reinforcement learning problems has attracted much attention in the past few years. This thesis focuses on this topic and makes several contributions, summarized as follows:

First, we propose the LEWE framework, which lets an agent improve a weak policy by sampling tasks for the weak policy to execute and then learning from the successful trajectories. Experiments show that LEWE can effectively and efficiently improve the weak policy.

Second, we propose the Napping framework to speed up functional policy gradient methods. It applies twice learning with random forests to improve performance while keeping comparable or even lower model complexity. Experiments in several domains show that the proposed method improves performance significantly and further reduces the time cost of the training and testing stages.

Third, we propose the MAPLE framework for meta-policy learning, which naturally adapts the policy gradient method by taking the parameters of the environment into account when building the policy model. An empirical study verifies that the learned meta-policies generalize well to changing environments sampled from a given distribution.

Fourth, we validate the proposed methods in a watering robot system.

Keywords/Search Tags:

machine learning, reinforcement learning, direct policy search, imitation learning, policy gradient

Related items

1	Research On Policy Learning Via Imitation
2	Theories, Algorithms And Applications Of Policy Gradient Reinforcement Learning
3	Research On Fast Policy Gradient Algorithms Of Reinforcement Learning Based On Adaptive Learning Rate
4	Robust Policy Gradient Algorithm Based On Actor-Critic In Deep Reinforcement Learning
5	Research On Multiagent Cooperation And Applications Based On Reinforcement Learning
6	Deep Deterministic Policy Gradient Based On Entropy Regularization And Regular Update
7	Research On Regularized Policy Gradient
8	Research Of Game Intelligence Based On Improved Policy Gradient Method
9	Recursive Least-squares Reinforcement Learning Based On An Improved Extreme Learning Machine
10	Deep Reinforcement Learning Based On Policy Gradient Optimization And Its Application In Agent Control