Apprenticeship learning and reinforcement learning with application to robotic control

Posted on: 2009-11-17 | Degree: Ph.D. | Type: Dissertation
University: Stanford University | Candidate: Abbeel, Pieter | Full Text: PDF
GTID: 1448390002992529 | Subject: Engineering
Abstract/Summary:
Many problems in robotics have unknown, stochastic, high-dimensional, and highly nonlinear dynamics, and pose significant challenges to both traditional control methods and reinforcement learning algorithms. Some of the key difficulties that arise in these problems are: (i) It is often difficult to write down, in closed form, a formal specification of the control task. For example, what is the objective function for "flying well"? (ii) It is often difficult to build a good dynamics model because of both data collection and data modeling challenges (similar to the "exploration problem" in reinforcement learning). (iii) It is often computationally expensive to find closed-loop controllers for high-dimensional, stochastic domains.

We describe learning algorithms with formal performance guarantees which show that these problems can be efficiently addressed in the apprenticeship learning setting, that is, the setting in which expert demonstrations of the task are available. Our algorithms are guaranteed to return a control policy with performance comparable to the expert's. We evaluate performance on the same task and in the same (typically stochastic, high-dimensional, and nonlinear) environment as the expert.

Besides having theoretical guarantees, our algorithms have also enabled us to solve some previously unsolved real-world control problems. They have enabled a quadruped robot to traverse challenging, previously unseen terrain. They have significantly extended the state of the art in autonomous helicopter flight. Our helicopter has performed by far the most challenging aerobatic maneuvers performed by any autonomous helicopter to date, including maneuvers such as continuous in-place flips, rolls, and tic-tocs, which only exceptional expert human pilots can fly. Our aerobatic flight performance is comparable to that of the best human pilots.
Keywords/Search Tags:Reinforcement learning, Performance