Design and analysis of efficient reinforcement learning algorithms

Posted on:1998-11-25

Degree:Ph.D

Type:Dissertation

University:University of Pittsburgh

Candidate:Fiechter, Claude-Nicolas

Full Text:PDF

GTID:1468390014973987

Subject:Computer Science

Abstract/Summary:

Reinforcement learning considers the problem of learning a task or behavior by interacting with one's environment. The learning agent is not explicitly told how the task is to be achieved and has to learn by trial-and-error, using only the rewards and punishments that it receives in response to the actions it takes.; In the last ten years there has been a rapidly growing interest in reinforcement learning techniques as a base for intelligent control architectures. Many methods have been proposed and a number of very successful applications have been developed.; This dissertation contributes to a theoretical foundation for the study of reinforcement learning by applying some of the methods and tools of computational learning theory to the problem. We propose a formal model of efficient reinforcement learning based on Valiant's Probably Approximately Correct (PAC) learning framework, and use it to design reinforcement learning algorithms and to analyze their performance.; We describe the first polynomial-time PAC algorithm for the general finite-state reinforcement learning problem and show that an active and directed exploration of its environment by the learning agent is necessary and sufficient to obtain efficient learning for that problem. We consider the trade-off between exploration and exploitation in reinforcement learning algorithms and show how in general an off-line PAC algorithm can be converted into an on-line algorithm that efficiently balances exploration and exploitation.; We also consider the problem of generalization in reinforcement learning and show how in some cases the underlying structure of the environment can be exploited to achieve faster learning. We describe a PAC algorithm for the associative reinforcement learning problem that uses a form of decision lists to represent the policies in a compact way and generalize across different inputs. In addition, we describe a PAC algorithm for a special case of reinforcement learning where the environment can be modeled by a linear system. This particular reinforcement learning problem corresponds to the so-called linear quadratic regulator which is extensively studied and used in automatic and adaptive control.

Keywords/Search Tags:

Reinforcement learning, Problem, PAC algorithm, Environment

Related items

1	Research On Motion Planning In Dynamic Environment Based On Deep Reinforcement Learning
2	Study Of Multi-agent Learning Problem Based On Reinforcement Learning
3	Research On The Pursuit-Evasion Problem With Multi-robot Based On Reinforcement Learning
4	Research Of Path Planning Problem Based On Reinforcement Learning
5	Research On The Sparse Reward Problem Based On Hierarchical Reinforcement Learning
6	The Decomposition And Reconstruction Of Complex Environment In Reinforcement Learning
7	Research On Active SLAM Algorithm Based On Deep Reinforcement Learning In Complex Environment
8	Robot Navigation Algorithm Based On Reinforcement Learning In Unknown Environment
9	Deconfounded Environment Reconstruction For Reinforcement Learning Based Recommendation
10	Research Of Auto-generating Test Paper Based On Reinforcement Learning