
Simulation-based algorithms for Markov decision processes

Posted on: 2003-11-14
Degree: Ph.D
Type: Dissertation
University: University of Maryland, College Park
Candidate: He, Ying
GTID: 1468390011479518
Subject: Engineering
Abstract/Summary:
Problems of sequential decision making under uncertainty are common in manufacturing, computer, and communication systems, and many such problems can be formulated as Markov Decision Processes (MDPs). Motivated by a capacity expansion and allocation problem in semiconductor manufacturing, we formulate a fab-level decision-making problem using a finite-horizon transient MDP model that can integrate the life-cycle dynamics of the fab and provide a trade-off between immediate and future benefits and costs.

However, for large and complicated systems formulated as MDPs, the classical methodology for computing optimal policies, dynamic programming, suffers from the so-called "curse of dimensionality" (the computational requirement grows exponentially with the number of states and controls) and the "curse of modeling" (an explicit model for the cost structure and/or the transition probabilities is not available).

In the problem settings to which our approaches apply, outputs are available from either a simulation model or the actual system in place of explicit transition probabilities. Our methodology is first to find the structure of optimal policies for some special cases, then to use that structure to construct parameterized heuristic policies for more general cases, and finally to implement simulation-based algorithms to determine the parameters of the heuristic policies. For the fab-level decision-making problem, we analyze the structure of the optimal policy for a special "one-machine, two-product" case and discuss the applicability of simulation-based algorithms.

We develop several simulation-based algorithms for MDPs to overcome the "curse of dimensionality" and the "curse of modeling", considering both theoretical and practical issues. First, we develop a simulation-based policy iteration algorithm for average-cost problems under a unichain assumption, relaxing the common recurrent-state assumption.
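To illustrate the general idea of simulation-based policy iteration, the following is a minimal sketch, not the algorithm developed in the dissertation: it uses a discounted-cost Monte Carlo rollout evaluation on a made-up two-state, two-action MDP (the dissertation's algorithm addresses average-cost problems under a unichain assumption). The key point it shares with the simulation-based setting is that the algorithm only draws sample transitions from a simulator and never touches the transition probabilities directly.

```python
import numpy as np

# A tiny two-state, two-action MDP available only through a simulator:
# the algorithm sees sampled transitions and costs, never the matrix P
# itself, mirroring the "curse of modeling" setting. Dynamics are illustrative.
P = {  # P[(state, action)] = next-state distribution over states {0, 1}
    (0, 0): [0.9, 0.1], (0, 1): [0.2, 0.8],
    (1, 0): [0.7, 0.3], (1, 1): [0.1, 0.9],
}
COST = {(0, 0): 1.0, (0, 1): 0.3, (1, 0): 2.0, (1, 1): 0.5}
rng = np.random.default_rng(0)

def step(s, a):
    """Simulate one transition; only samples are exposed to the algorithm."""
    return rng.choice(2, p=P[(s, a)]), COST[(s, a)]

def rollout_cost(s, a, policy, horizon=50, beta=0.9):
    """Monte Carlo estimate of Q(s, a): take action a, then follow the policy."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        s_next, c = step(s, a)
        total += discount * c
        discount *= beta
        s, a = s_next, policy[s_next]
    return total

def simulation_based_policy_iteration(n_sweeps=10, n_rollouts=200):
    policy = {0: 0, 1: 0}  # arbitrary initial policy
    for _ in range(n_sweeps):
        # Policy evaluation by simulation: average many rollouts per (s, a).
        q = {(s, a): np.mean([rollout_cost(s, a, policy)
                              for _ in range(n_rollouts)])
             for s in (0, 1) for a in (0, 1)}
        # Policy improvement: act greedily with respect to the Q estimates.
        policy = {s: min((0, 1), key=lambda a: q[(s, a)]) for s in (0, 1)}
    return policy

print(simulation_based_policy_iteration())
```

For these illustrative costs, action 1 is cheaper in both states, so greedy improvement settles on the policy choosing action 1 everywhere; the sampling noise in the Q estimates is what the dissertation's convergence analysis must contend with.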
Second, for weighted-cost problems, we develop a new two-timescale simulation-based gradient algorithm based on perturbation analysis, provide a theoretical convergence proof, and compare it with two recently proposed simulation-based gradient algorithms. Third, we propose two new Simultaneous Perturbation Stochastic Approximation (SPSA) algorithms for weighted-cost problems and verify their effectiveness via simulation; we then consider a general SPSA algorithm for function minimization and show its convergence under a weaker assumption: the function need not be differentiable.
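For context, the basic first-order SPSA recursion (due to Spall) that the dissertation's algorithms build on can be sketched as follows. This is a generic illustration, not the dissertation's weighted-cost algorithms; the gain-sequence constants and the test objective are assumptions chosen for the example. SPSA's appeal in the simulation setting is that each iteration needs only two (possibly noisy) evaluations of the objective, regardless of the parameter dimension.

```python
import numpy as np

def spsa_minimize(f, theta0, n_iter=500, a=0.1, c=0.1,
                  alpha=0.602, gamma=0.101, seed=0):
    """Basic SPSA: estimate the gradient of f from two function
    evaluations per iteration using a random simultaneous perturbation."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for k in range(1, n_iter + 1):
        ak = a / k**alpha                 # decaying step-size sequence
        ck = c / k**gamma                 # decaying perturbation size
        delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher signs
        # Two-sided simultaneous-perturbation gradient estimate.
        g_hat = (f(theta + ck * delta) - f(theta - ck * delta)) / (2 * ck * delta)
        theta = theta - ak * g_hat
    return theta

# Usage: minimize a simple quadratic; iterates should approach (1, -2).
theta = spsa_minimize(lambda x: (x[0] - 1)**2 + (x[1] + 2)**2, [5.0, 5.0])
```

Note that the gradient estimate never differentiates f; this is what makes it natural to weaken the differentiability assumption, as done in the dissertation's convergence analysis of a general SPSA algorithm.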
Keywords/Search Tags: Simulation-based algorithms, Decision, Problem