Decomposition of large scale dynamic programming by state aggregation

Posted on:2003-07-03

Degree:Ph.D

Type:Dissertation

University:University of Maryland College Park

Candidate:Zhang, Chang

Full Text:PDF

GTID:1468390011485752

Subject:Engineering

Abstract/Summary:

Dynamic Programming (DP) plays an important role in decision making, intelligent control and optimal control. The “curse of dimensionality” is a major obstacle to its application. The purpose of the research in this dissertation is to build a “bridge” over this obstacle.; This dissertation proposes the FHDP (Finite-horizon Hierarchical Dynamic Programming) to solve stochastic shortest path problems. Given a policy, the FHDP can find the expected cost-to-go values in parallel to reduce the computational time to a logarithm level.; For the infinite horizon discounted case, TD(0) based state aggregation algorithm usually suffers the drawbacks of large variation or slow convergence. This dissertation proposes the DCA (Direct Computation based on Aggregation) scheme that works under more general conditions than some other approximate DP algorithms and converges smoothly to the limits. This dissertation also proves that under a given stationary policy, if we select representative states in each macro-state according to the steady state probability distribution, then the value function of the macro-state is equal to or approximately equal to the mean of the value functions of the states within the macro-state for some special partition structures.; State aggregation algorithms usually require that states in each macro-state be able to take different actions. This dissertation proposes two approaches such that the states within the same macro-state can take the same action without causing large errors. Therefore, the number of macro-actions to be evaluated at each macro-state is reduced from |S_k| Ψ to only Ψ, where |S_k| is the number of states in cluster S_k and Ψ is the size of control space of a real state.; Distributed DP algorithm fails when the states are densely coupled with each other, since the amount of information transmitted between each subsystem is prohibitive. This dissertation presents the Distributed Hierarchical DP (DHDP) algorithm and the Distributed Hierarchical Q-learning (DHQ) algorithm. The size of data transmitted from a subsystem to a neighboring subsystem in the communication channel is only 1 unit. The minimum information exchange among subsystems makes the DHDP and DHQ applicable to systems with very limited channel capacities.; This dissertation also proves the convergence of the adaptive soft state aggregation algorithm, multi-level state aggregation algorithm, DCA algorithm for average cost problems, etc. Application of the state aggregation algorithms on network admission control and some other examples are provided.

Keywords/Search Tags:

State aggregation, Programming, Large, /italic

Related items

1	Network state models for analysis and aggregation in large-scale quality of service -aware multiclass networks
2	Research On Large Graph Aggregation Algorithm Based On Finite Memory
3	Design And Implementation Of The Intelligent Agriculture Large Data Aggregation Platform Based On Rule
4	Adaptive dynamic programming for high-dimensional, multicollinear state spaces
5	The Investigation And Exploitation Of Hrmis In The Large Collectivized State-Owned Enterprises
6	Aggregation and time series implications of state-dependent consumption
7	The State Data Aggregation Research In Wireless Sensor Networks
8	Research On Topology Aggregation Algorithm For Large-scale Network Simulation
9	Research On Openflow Based Adaptive Dynamic Bandwidth Aggregation Technology
10	The Research Of Radio Resource Scheduling On Carrier Aggregation