Reinforcement learning in environments with independent delayed-sense dynamics

Posted on: 2009-10-16
Degree: M.Sc
Type: Thesis
University: University of Alberta (Canada)
Candidate: Shahamiri, Masoud
Full Text: PDF
GTID: 2448390002997949
Subject: Computer Science
Abstract/Summary:
This thesis is a detailed investigation into applying reinforcement learning to environments with independent delayed-sense dynamics (IDSD), where some of the state variables evolve independently of both the agent's actions and the other state variables, and can be sensed only after a delay. These independent state variables are analogous to disturbances, since they are independent of control actions and are not observable before the agent commits to a course of action.

In this thesis, we first formalize IDSD problems and then develop four reinforcement learning algorithms that exploit the structure of IDSD problems to achieve better efficiency. Two of the algorithms are partially model-based and two are model-free. We argue that, for the same number of experiments, the quality of the policy learned by the proposed algorithms is better than that of the policy learned by conventional reinforcement learning algorithms.

We demonstrate the effectiveness of our algorithms by applying them to traffic grid-world problems and to a hybrid vehicle problem, in which traffic and driver acceleration, respectively, play the role of the independent state variable. We show experimentally that our algorithms evaluate a given policy more accurately than the corresponding TD(0) algorithm. We also show that, in the case of control, our algorithms learn substantially faster than conventional reinforcement learning algorithms that do not use knowledge of the IDSD structure.
Keywords/Search Tags: Reinforcement learning, Environments with independent delayed-sense dynamics, IDSD problems