
Robust dynamic programming for Markov decision processes and filtering applications to video target tracking

Posted on: 2010-04-21
Degree: Ph.D
Type: Dissertation
University: Arizona State University
Candidate: Li, Baohua
Full Text: PDF
GTID: 1448390002479313
Subject: Engineering
Abstract/Summary:
This dissertation addresses robust dynamic programming for finite-state, finite-action, discounted infinite-horizon Markov decision processes (MDPs) with uncertain transition matrices in deterministic policy spaces, and robust recursive least-squares (RLS) filtering for stochastic linear dynamic systems with applications to video target tracking.
For robust RLS filtering and video target tracking, a fast video target tracking system is developed to obtain the target position at each frame. In this system, an RLS filter provides estimates and predictions of the target position and velocity. When video quality is degraded by noise, the sensitivity of an RLS filter to variation in the affine transform and to disturbances in the observed target position may cause deviations from the true target position large enough to lose the target. Therefore, an interval RLS filter is developed. It produces estimates and predictions of the target position and velocity as narrow intervals. The interval-filter-based video target tracking system is more robust to noise in video sequences. The performance of both video target tracking systems on real-world video sequences demonstrates their effectiveness.
For robust dynamic programming to solve MDPs with uncertain transition matrices, a generalized robust optimality criterion is first developed to guarantee the existence of an optimal or near-optimal policy for any uncertain transition matrix, whether its uncertainty is independent or correlated. Theorems are proved that guarantee that a stationary policy is optimal or near-optimal. Based on the proposed optimality criterion, two approximate robust policy iteration algorithms are developed, one using successive approximation and one using multilayer perceptron neural networks. It is proved that the two algorithms converge to a stationary optimal or near-optimal policy surely and in a probabilistic sense, respectively. In addition, a belief function model is developed to obtain an optimal tight set estimate of a transition matrix at a high confidence level. This model can be used whether the observed samples of state transitions are few or many, whether prior probability distributions are precise or imprecise, and whether variations of the transition probability rows are independent or correlated.
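To make the idea of dynamic programming under an uncertain transition matrix concrete, the following is a minimal sketch of classical worst-case (max-min) value iteration. It is not the dissertation's generalized criterion or its approximate policy iteration algorithms. It assumes the uncertainty is given as a small finite list of candidate transition models and that the worst case may be taken separately for each state-action pair (the independent case only). All names, rewards, and probabilities are hypothetical.

```python
def robust_value_iteration(rewards, models, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Worst-case value iteration for a finite discounted MDP.

    rewards[s][a]  : immediate reward for action a in state s
    models         : list of candidate kernels, each P[s][a][t] = Pr(t | s, a)
    Assumption: the adversary picks the worst model independently for
    every (state, action) pair, so correlated uncertainty is NOT modeled.
    """
    n_states = len(rewards)
    n_actions = len(rewards[0])
    v = [0.0] * n_states
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        # Worst-case one-step lookahead over all candidate models.
        q = [
            [
                min(
                    rewards[s][a]
                    + gamma * sum(P[s][a][t] * v[t] for t in range(n_states))
                    for P in models
                )
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        new_v = [max(row) for row in q]
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    # Greedy deterministic stationary policy w.r.t. the worst-case Q-values.
    policy = [max(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return v, policy


# Hypothetical 2-state, 2-action example.
# State 0: action 0 is "safe" (reward 1.5, stays in state 0);
#          action 1 is "risky" (reward 2.0, next state depends on the model).
# State 1: absorbing with zero reward.
rewards = [[1.5, 2.0], [0.0, 0.0]]
model_a = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]  # risky stays in 0
model_b = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]  # risky falls to 1

nominal_v, nominal_policy = robust_value_iteration(rewards, [model_a], gamma=0.5)
robust_v, robust_policy = robust_value_iteration(rewards, [model_a, model_b], gamma=0.5)
print("nominal:", nominal_v, nominal_policy)
print("robust: ", robust_v, robust_policy)
```

In this toy problem the policy computed from the single nominal model prefers the risky action in state 0, while the worst-case policy switches to the safe action once the second candidate model is admitted. This illustrates why a robust criterion can change which stationary policy is optimal.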
Keywords/Search Tags:Robust dynamic programming, Video target tracking, Transition, RLS filter