Font Size: a A A

A binary dynamic programming problem with affine transition and reward functions: Properties and algorithm

Posted on:2004-07-17Degree:Ph.DType:Dissertation
University:Georgia Institute of TechnologyCandidate:Gatica, Ricardo AntonioFull Text:PDF
GTID:1450390011453374Subject:Engineering
Abstract/Summary:
We consider a deterministic, binary-decision, dynamic programming problem with affine transition and reward functions (BATR). We limit our analysis to stationary threshold policies, which we prove to be optimal for the problem. The distinctive feature of our approach to the problem is that, instead of focusing directly on the set of threshold values, we base our analysis on a set J of decision sequences that serve as unique representatives of threshold policies. More specifically, we show that a decision sequence J is in J if and only if it is a unique representative of some threshold policy. This suggests that instead of solving the continuous optimization problem of finding an optimal threshold value, we may solve an equivalent discrete optimization problem of finding an optimal decision sequence in J . Two main results support that conclusion: First, we show that under a lexicographic order , the average reward function w is unimodal on ( J , ). Second, we show that the average reward of every non-periodic sequence can be approximated to any desired level of precision by the average reward of a periodic decision sequence. Therefore the search for an optimal decision sequence can be restricted to the subset of periodic decision sequences in J , which can easily be shown to be countable. Based on these results we develop an approximation scheme for BATR that resembles a binary search over the set K that contains the periods of the periodic sequences in J . Given a value &egr; > 0, the algorithm finds an &egr;-optimal periodic sequence and provides an associated &egr;-optimal threshold value.
Keywords/Search Tags:Problem, Reward, Sequence, Decision, Threshold, Optimal, Periodic
Related items