
Research On Reinforcement Learning Methods Based On Direct Policy Search

Posted on: 2014-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: Q Da
Full Text: PDF
GTID: 2308330482952238
Subject: Computer application technology
Abstract/Summary:
Reinforcement learning is a major research topic in machine learning; it aims to let an agent improve itself through trial-and-error learning from the environment. With the development of modern supervised learning algorithms and optimization techniques, how to apply supervised learning or optimization methods to reinforcement learning problems has attracted much attention in the past few years. This thesis focuses on this topic and makes several contributions, summarized as follows:
First, we propose the LEWE framework, which lets an agent improve a weak policy by sampling tasks for the weak policy to execute and then learning from the successful trajectories. Experiments show that LEWE improves the weak policy effectively and efficiently.
Second, we propose the Napping framework to speed up functional policy gradient methods. It applies twice learning with random forests to improve performance while keeping comparable or even lower model complexity. Experiments in several domains show that the proposed method improves performance significantly and further reduces the time cost of the training and testing stages.
Third, we propose MAPLE, a meta-policy learning framework that naturally adapts the policy gradient method by taking the parameters of the environment into consideration when building the policy model. An empirical study verifies that the learned meta-policies generalize well to changing environments sampled from a given distribution.
Fourth, we validate the proposed methods in a watering robot system.
Keywords/Search Tags:machine learning, reinforcement learning, direct policy search, imitation learning, policy gradient