Font Size: a A A

Design and analysis of efficient reinforcement learning algorithms

Posted on:1998-11-25Degree:Ph.DType:Dissertation
University:University of PittsburghCandidate:Fiechter, Claude-NicolasFull Text:PDF
GTID:1468390014973987Subject:Computer Science
Abstract/Summary:
Reinforcement learning considers the problem of learning a task or behavior by interacting with one's environment. The learning agent is not explicitly told how the task is to be achieved and has to learn by trial-and-error, using only the rewards and punishments that it receives in response to the actions it takes.; In the last ten years there has been a rapidly growing interest in reinforcement learning techniques as a base for intelligent control architectures. Many methods have been proposed and a number of very successful applications have been developed.; This dissertation contributes to a theoretical foundation for the study of reinforcement learning by applying some of the methods and tools of computational learning theory to the problem. We propose a formal model of efficient reinforcement learning based on Valiant's Probably Approximately Correct (PAC) learning framework, and use it to design reinforcement learning algorithms and to analyze their performance.; We describe the first polynomial-time PAC algorithm for the general finite-state reinforcement learning problem and show that an active and directed exploration of its environment by the learning agent is necessary and sufficient to obtain efficient learning for that problem. We consider the trade-off between exploration and exploitation in reinforcement learning algorithms and show how in general an off-line PAC algorithm can be converted into an on-line algorithm that efficiently balances exploration and exploitation.; We also consider the problem of generalization in reinforcement learning and show how in some cases the underlying structure of the environment can be exploited to achieve faster learning. We describe a PAC algorithm for the associative reinforcement learning problem that uses a form of decision lists to represent the policies in a compact way and generalize across different inputs. In addition, we describe a PAC algorithm for a special case of reinforcement learning where the environment can be modeled by a linear system. This particular reinforcement learning problem corresponds to the so-called linear quadratic regulator which is extensively studied and used in automatic and adaptive control.
Keywords/Search Tags:Reinforcement learning, Problem, PAC algorithm, Environment
Related items