
Using reinforcement learning to improve network durability

Posted on: 2014-10-13
Degree: Ph.D
Type: Dissertation
University: Rensselaer Polytechnic Institute
Candidate: Hammel, Erik
Full Text: PDF
GTID: 1458390005987597
Subject: Applied Mathematics
Abstract/Summary:
Our goal is to determine and optimize the efficacy of reinforcing an existing flow network to prevent unmet demand caused by imminent disruptions. We are given failure probabilities for the edges of the network and are asked to find the edges whose reinforcement best provides durability to the network post-event. The problem is extended to multiple time steps to address the trade-off between available resources and quality of installations: the farther in advance of the event decisions are made, the more resources are available but the less reliable the forecast information. This sequential decision-making process is a classic example of dynamic programming. To avoid the "curses of dimensionality", we formulate an approximate dynamic program. To improve performance, especially as applied to flow networks, we derive several adaptations of reinforcement learning concepts. This involves developing a policy, a function that makes installation decisions given current forecast information, in a two-step process: policy evaluation and policy improvement.

The primary solution technique takes forecast samples from a Monte Carlo simulation in the style of stochastic programming. Once a forecast is obtained, additional samples of the forecast probabilities are drawn to determine edge capacities for the given time step; these form the state information used by the approximate dynamic program. The sampled outcomes are used to define the network constraints for the policy improvement step. The approximation of future costs is then refined by comparing the improved policy against a desirable target objective, and this process is repeated over several iterations. Lastly, we provide empirical evidence that corroborates basic convergence theorems for simpler forms of the reinforcement learning process.

With a trained policy in hand, we compare its performance against traditional two-stage stochastic programs with recourse that use a sample average approximation model. We consider several implementations of the stochastic problem to gauge performance in a variety of ways. The material presented here is developed in the context of preparing urban infrastructure against damage caused by disasters; however, it is applicable to any flow network. This work contributes to both multistage stochastic programming and approximate dynamic programming by bringing techniques from each field to bear on the other. We also apply reinforcement learning techniques to flow networks, a combination that, as of this writing, has yet to be addressed.
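
As a concrete illustration of the loop described above, the following is a minimal, runnable sketch of a policy-evaluation/policy-improvement cycle on a toy network of three parallel source-to-sink edges. Every constant and modeling choice in it (the capacities, the halved failure probability per reinforcement, the linear value approximation) is an assumption made for illustration, not the dissertation's actual model.

```python
# Hypothetical ADP sketch: greedy policy improvement against a linear value
# approximation, plus Monte Carlo policy evaluation on realized costs.
import numpy as np

rng = np.random.default_rng(0)

CAP = np.array([4.0, 3.0, 3.0])      # edge capacities (toy network)
P_FAIL = np.array([0.4, 0.3, 0.2])   # forecast failure probabilities
DEMAND = 8.0                         # demand to satisfy post-event
T = 2                                # decision periods before the event
ALPHA = 0.05                         # learning rate for value updates

def features(p, budget):
    # Linear-in-features approximation of the cost-to-go (an assumption).
    return np.append(p, budget)

def unmet_demand(p):
    # Post-event cost: sample edge survival, measure the shortfall.
    alive = rng.random(p.size) > p
    return max(0.0, DEMAND - CAP[alive].sum())

def step(p, budget, a):
    # Action a reinforces edge a (halving its failure probability)
    # at unit cost; a == -1 waits.
    p = p.copy()
    if a >= 0 and budget >= 1.0:
        p[a] *= 0.5
        budget -= 1.0
    return p, budget

def greedy_action(p, budget, theta):
    # Policy improvement: act greedily w.r.t. the current value estimate.
    best_a, best_q = -1, theta @ features(p, budget)
    for a in range(p.size):
        q = theta @ features(*step(p, budget, a))
        if q < best_q:
            best_a, best_q = a, q
    return best_a

def explore(p, budget, theta, eps=0.1):
    # Occasional random actions keep the Monte Carlo samples diverse.
    if rng.random() < eps:
        return int(rng.integers(-1, p.size))
    return greedy_action(p, budget, theta)

theta = np.zeros(P_FAIL.size + 1)
for _ in range(2000):                     # training iterations
    p, budget, visited = P_FAIL.copy(), 2.0, []
    for _ in range(T):
        visited.append((p, budget))
        p, budget = step(p, budget, explore(p, budget, theta))
    cost = unmet_demand(p)                # disruption realized after period T
    for ps, bs in visited:                # policy evaluation: regress the
        phi = features(ps, bs)            # visited states' values toward
        theta += ALPHA * (cost - theta @ phi) * phi  # the realized cost
print("trained value weights:", theta)
```

Here the greedy step plays the role of policy improvement and the terminal regression is a simple Monte Carlo form of policy evaluation; the dissertation's version additionally encodes the sampled network constraints in the improvement step.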
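
For the benchmark, a hypothetical sample-average-approximation model of a two-stage stochastic program with recourse might look as follows. The binary installation decision is relaxed here to a fractional level x_e in [0, 1] so the sketch stays a linear program solvable with scipy.optimize.linprog; this is not necessarily how the dissertation formulates it.

```python
# Hypothetical SAA baseline: first-stage reinforcement levels x, second-stage
# recourse u_s = unmet demand per sampled scenario, minimized on average.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
CAP = np.array([4.0, 3.0, 3.0])
P_FAIL = np.array([0.4, 0.3, 0.2])
DEMAND, BUDGET, S = 8.0, 1.5, 200
n = CAP.size

# Sampled disruption scenarios: 1.0 where the edge survives.
alive = (rng.random((S, n)) > P_FAIL).astype(float)

# Variables z = [x_1..x_n, u_1..u_S]; minimize the average shortfall.
c = np.concatenate([np.zeros(n), np.full(S, 1.0 / S)])

A_ub, b_ub = [], []
for s in range(S):
    # Surviving capacity is CAP_e * (x_e + (1 - x_e) * alive_se), so
    # u_s >= DEMAND - CAP@alive_s - sum_e CAP_e (1 - alive_se) x_e.
    row = np.zeros(n + S)
    row[:n] = -CAP * (1.0 - alive[s])
    row[n + s] = -1.0
    A_ub.append(row)
    b_ub.append(-(DEMAND - CAP @ alive[s]))
A_ub.append(np.concatenate([np.ones(n), np.zeros(S)]))  # budget on x
b_ub.append(BUDGET)

bounds = [(0, 1)] * n + [(0, None)] * S
res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=bounds, method="highs")
print("reinforcement plan:", res.x[:n].round(3))
print("expected unmet demand:", round(res.fun, 3))
```

Keeping x fractional is what preserves linearity; with binary installation decisions the same model becomes a mixed-integer program.
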
Keywords/Search Tags: Network, Reinforcement learning, Flow