Font Size: a A A

Optimal learning strategies for multi-attribute resource allocation problem

Posted on:2006-10-04Degree:Ph.DType:Dissertation
University:Princeton UniversityCandidate:George, AbrahamFull Text:PDF
GTID:1458390008976839Subject:Operations Research
Abstract/Summary:
We address the problem of optimal learning in approximate dynamic programming algorithms for solving discrete, multi-attribute resource allocation problems. Traditional approaches divide the problem into time stages and compute the optimal decisions for each time period using value functions that capture the information regarding the future state of the system. The curses of dimensionality, typical of such problems, force us to use approximate dynamic programming where we step forward in time and compute approximations of value functions of only the states that are actually visited. The challenges involved include estimating the value of a multi-attribute resource where the attributes could be categorical in nature and handling an extremely large number of attribute vectors, which make it hard to form reliable value estimates with limited information.;In order to overcome the problem of large state spaces, we could represent the resources at different levels of aggregation, producing the classic tradeoff between statistical and structural error. We propose a method which estimates the value of a resource as a weighted combination of values at different levels of aggregation. We derive the optimal weights which minimize the expected error, taking into account that the values are correlated and propose a few heuristic weighting schemes that seem promising. We demonstrate experimentally that our aggregation techniques produce significant improvements in solution quality, first for a single-asset allocation problem which can be solved to optimality and later in a dynamic fleet management problem with multiple asset classes.;The observations of value function estimates in approximate dynamic programming typically come from a data series that can be initially highly transient. The degree of transience affects the choice of stepsize parameters that produce the fastest convergence and can vary widely among the value function parameters for the same dynamic program. Practical computational work tends to use formulas with parameters that have to be tuned for specific applications. We derive an optimal stepsize formula that minimizes estimation error. Experimental work in several different problem settings shows that an approximation of this rule provides faster convergence, without requiring any tuning of parameters, than other popular formulas.
Keywords/Search Tags:Multi-attribute resource, Problem, Optimal, Approximate dynamic programming, Allocation, Parameters
Related items