Font Size: a A A

Decomposition of large scale dynamic programming by state aggregation

Posted on:2003-07-03Degree:Ph.DType:Dissertation
University:University of Maryland College ParkCandidate:Zhang, ChangFull Text:PDF
GTID:1468390011485752Subject:Engineering
Abstract/Summary:
Dynamic Programming (DP) plays an important role in decision making, intelligent control and optimal control. The “curse of dimensionality” is a major obstacle to its application. The purpose of the research in this dissertation is to build a “bridge” over this obstacle.; This dissertation proposes the FHDP (Finite-horizon Hierarchical Dynamic Programming) to solve stochastic shortest path problems. Given a policy, the FHDP can find the expected cost-to-go values in parallel to reduce the computational time to a logarithm level.; For the infinite horizon discounted case, TD(0) based state aggregation algorithm usually suffers the drawbacks of large variation or slow convergence. This dissertation proposes the DCA (Direct Computation based on Aggregation) scheme that works under more general conditions than some other approximate DP algorithms and converges smoothly to the limits. This dissertation also proves that under a given stationary policy, if we select representative states in each macro-state according to the steady state probability distribution, then the value function of the macro-state is equal to or approximately equal to the mean of the value functions of the states within the macro-state for some special partition structures.; State aggregation algorithms usually require that states in each macro-state be able to take different actions. This dissertation proposes two approaches such that the states within the same macro-state can take the same action without causing large errors. Therefore, the number of macro-actions to be evaluated at each macro-state is reduced from |Sk| Ψ to only Ψ, where |Sk| is the number of states in cluster Sk and Ψ is the size of control space of a real state.; Distributed DP algorithm fails when the states are densely coupled with each other, since the amount of information transmitted between each subsystem is prohibitive. This dissertation presents the Distributed Hierarchical DP (DHDP) algorithm and the Distributed Hierarchical Q-learning (DHQ) algorithm. The size of data transmitted from a subsystem to a neighboring subsystem in the communication channel is only 1 unit. The minimum information exchange among subsystems makes the DHDP and DHQ applicable to systems with very limited channel capacities.; This dissertation also proves the convergence of the adaptive soft state aggregation algorithm, multi-level state aggregation algorithm, DCA algorithm for average cost problems, etc. Application of the state aggregation algorithms on network admission control and some other examples are provided.
Keywords/Search Tags:State aggregation, Programming, Large, /italic
Related items