The Design And Implementation Of Point-based Policy Iteration Algorithms For POMDP

Posted on:2014-01-20

Degree:Master

Type:Thesis

Country:China

Candidate:W Li

Full Text:PDF

GTID:2248330395495723

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

Planning under uncertainty is increasingly important in research and application field, and so is the robust and scalable algorithm under uncertainty for development of effective autonomous and semi-autonomous systems. Partially Observable Markov Decision Processes (POMDPs) is a framework for optimal decision making in uncertainty environment. However, for the disaster of history and dimension, exact method for POMDP is not practical for real word application. Though policy iteration method is efficient than value iteration, its application in real word also restricted in some domain of small domain.Approximate solutions, especially point-based algorithms make POMDP practical. By collecting important belief points of the space and then planning over those points, point-based algorithms gain approximation of the optimal policy.This thesis introduce the MDP and its generalization to POMDP, and as well the value function and iteration process. Through a brief introduction of the exact solutions, including the policy iteration method and the point-based solutions (PBVI, HSVI, FSVI), we give the analysis of time complexity of them, and conclude the reason for exact methods’ impractical for real world problems. We also details the differences in point collection and backup of them.This thesis comes up with two new algorithms, PIPBVI and PBHSVI and conduct experiments in four benchmark data sets. Through the comparison with other classic point-based algorithms, we analyze and discuss their performance in different problems. PIPBVI gains improvement from PBVI by the pruning of belief sets and vector sets, and the interpolation of points where will improvement the value function. PBHSVI combines the advantages of policy iteration and heuristic searching policy of point-based methods, by using interleaving step of straightforward policy iteration over the state controllers and backup operation over corresponding points, PBHSVI speeds up the convergence without loss of the optimality of policy.

Keywords/Search Tags:

MDP, POMDP, Policy iteration, Point-based algorithm

PDF Full Text Request

Related items

1	The Design And Implementation Of Point-based POMDP Policy Iteration Algorithm
2	The Research And Design Of Approximation Methods For POMDP
3	Research On Point-based Value Iteration Algorithms In POMDP Domains
4	The Research And Design Of Point-based POMDP Value Iteration Algorithm
5	Research On Approximate Algorithms Of POMDP And Application To TCM Therapy Planning
6	Research On Robot Navigation Based On POMDP
7	Point-Based POMDP Solvers: Survey and Comparative Analysis
8	Admission Control Of P2P-based Media Delivery Network
9	Research On Least-Squares Policy Iteration Algorithms
10	Algorithms for partially observable Markov decision processes