
A Bayesian Framework for Online Parameter Learning in POMDPs

Posted on: 2012-08-21
Degree: Ph.D.
Type: Thesis
University: McGill University (Canada)
Candidate: Atrash, Amin
Full Text: PDF
GTID: 2468390011469223
Subject: Computer Science
Abstract/Summary:
Decision-making under uncertainty has become critical as autonomous and semi-autonomous agents become more ubiquitous in our society. These agents must deal with uncertainty and ambiguity from the environment and still perform desired tasks robustly. Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for modelling agents operating in such an environment. These models are able to capture the uncertainty arising from noisy sensors and inaccurate actuators, and to support decision-making in light of the agent's incomplete knowledge of the world. POMDPs have been applied successfully in domains ranging from robotics to dialogue management to medical systems.

Extensive research has been conducted on methods for optimizing policies for POMDPs. However, these methods typically assume that a model of the environment is known. This thesis presents a Bayesian reinforcement learning framework for learning POMDP parameters during execution. The framework takes advantage of agents that work alongside an operator who can provide optimal policy information to help direct the learning. By using Bayesian reinforcement learning, the agent can perform learning concurrently with execution, incorporate incoming data immediately, and take advantage of prior knowledge of the world. With such a framework, an agent is able to adapt its policy to that of the operator.

This framework is validated on data collected from the interaction manager of an autonomous wheelchair. The interaction manager acts as an intelligent interface between the user and the robot, allowing the user to issue high-level commands through a natural interface such as speech. The interaction manager is controlled using a POMDP and provides a rich scenario for learning, in which the agent must adjust to the needs of the user over time.
Keywords/Search Tags: Framework, Agent, Bayesian, POMDPs