
An Approximate Dynamic Programming Approach to Financial Execution for Weapon System Programs

Posted on: 2014-09-11
Degree: Ph.D
Type: Thesis
University: George Mason University
Candidate: Morman, Erich D
GTID: 2458390005999947
Subject: Operations Research
Abstract/Summary:
During each twelve-month fiscal year (FY) cycle, weapon system programs across the Department of Defense (DoD) are expected to execute their allocated budgets in a timely manner. As the FY progresses, a weapon system's cash flow state at any given moment is measured primarily by the cumulative amounts of its budget that are committed, obligated, accrued, or expended. Regulatory and oversight initiatives, such as midyear financial execution reviews and published monthly execution goals, are designed to ensure high utilization of a weapon system's allocated yearly budget. The challenge of finding a monthly commitment cash flow policy that achieves high utilization can be expressed as a sequential decision-making problem over time. Markovian analysis is the mathematical area dedicated to modeling and solving such problems, with emphasis on understanding how the system moves from state to state throughout the decision process. The complexity of the problem examined in this research stems from the size of the multimillion-dollar budgets in question and the numerous projects they fund. Weapon system offices must therefore make hundreds of commitment action determinations over any given fiscal year under uncertainty. This intricate decision system requires that decision makers have sound mathematical tools to assist them in determining an optimal commitment policy.

The research described in this thesis uses approximate dynamic programming (ADP) techniques as a solution method for the financial execution commitment problem for DoD weapon system programs. ADP ideas and concepts are extensions of Markovian analysis principles, and the modeling effort uses a simulation-based optimization method specifically geared toward sequential decision-making problems. The more traditional dynamic programming (DP) approaches are variants on the implementation of Bellman's recursive optimality equation. Unfortunately, as a result of the "curse of dimensionality" and the "curse of modeling," these classical methods tend to break down when applied to problems with more complex structure. The ADP approach expands upon the recursive idea embedded in Bellman's optimality equation and addresses the difficulties associated with both curses.

As part of this research, two types of ADP models were built around the use of a post-decision state (PDS) variable, and their application was tested against a collection of theoretical financial execution project scenarios. The initial model leveraged a Q-learning design, while the second model used value function learning. In each approach, the formulation of an optimal policy depended on three modeling phases, referred to as exploration, exploitation (learning), and learnt. The exploration phase relaxes the driving optimality conditions while simulating the execution decision system. The exploitation (learning) phase incorporates the optimality conditions within the simulation environment. Lastly, the learnt phase leverages the outputs produced by exploration and exploitation to provide the recommended optimal policy.
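The abstract states Bellman's equation only in words. For reference, a standard textbook form of the recursion and its post-decision-state variant is sketched below in LaTeX; the symbols (V for the value function, S_t for the pre-decision state, S_t^x for the post-decision state, x_t for the commitment action, C for the contribution, gamma for the discount factor) are notational assumptions for this sketch, not taken from the thesis:

V_t(S_t) = \max_{x_t \in \mathcal{X}_t} \Big( C_t(S_t, x_t) + \gamma \, \mathbb{E}\big[ V_{t+1}(S_{t+1}) \mid S_t, x_t \big] \Big)

V_t^x(S_t^x) = \mathbb{E}\big[ V_{t+1}(S_{t+1}) \mid S_t^x \big], \qquad V_t(S_t) = \max_{x_t} \big( C_t(S_t, x_t) + \gamma \, V_t^x(S_t^x) \big)

Writing the recursion around the post-decision state moves the expectation outside the maximization, which is the usual way a PDS formulation sidesteps computing an explicit expectation over transitions, the difficulty the "curse of modeling" refers to.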
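To make the three phases concrete, the following is a minimal, self-contained Python sketch of tabular Q-learning for a toy monthly commitment problem. Everything in it (the unit budget, the slippage probability, the reward, the parameters) is invented for illustration, and it deliberately omits the post-decision-state machinery of the thesis models; it shows only the exploration / exploitation (learning) / learnt structure the abstract describes.

import random
from collections import defaultdict

MONTHS = 12            # decision epochs: one commitment decision per month
BUDGET = 10            # budget in discrete "units" to keep the state space tiny
ACTIONS = range(0, 4)  # units committed in a month (0..3)
ALPHA = 0.1            # learning rate

Q = defaultdict(float)  # Q[(month, committed_so_far, action)]

def step(month, committed, action):
    """Commit `action` units; random slippage may cancel one unit."""
    action = min(action, BUDGET - committed)             # cannot overcommit
    realized = max(action - (random.random() < 0.2), 0)  # 20% chance one unit slips
    nxt = committed + realized
    # Reward utilization; penalize any budget left uncommitted at year end.
    reward = realized - (BUDGET - nxt if month == MONTHS - 1 else 0)
    return nxt, reward

def simulate(epsilon):
    """One fiscal-year episode; epsilon controls exploration vs exploitation."""
    committed = 0
    for month in range(MONTHS):
        if random.random() < epsilon:   # exploration: act randomly
            action = random.choice(list(ACTIONS))
        else:                           # exploitation: greedy in current Q
            action = max(ACTIONS, key=lambda a: Q[(month, committed, a)])
        nxt, reward = step(month, committed, action)
        # One-step Q-learning update toward reward + best next-state value.
        best_next = 0.0 if month == MONTHS - 1 else max(
            Q[(month + 1, nxt, a)] for a in ACTIONS)
        key = (month, committed, action)
        Q[key] += ALPHA * (reward + best_next - Q[key])
        committed = nxt

# Exploration phase: relax optimality and roam the state space.
for _ in range(2000):
    simulate(epsilon=1.0)
# Exploitation (learning) phase: follow and refine the greedy policy.
for _ in range(2000):
    simulate(epsilon=0.1)

# "Learnt" phase: read out the recommended commitment policy from Q.
committed = 0
for month in range(MONTHS):
    action = max(ACTIONS, key=lambda a: Q[(month, committed, a)])
    print(f"month {month + 1:2d}: commit {action} unit(s)")
    committed = min(committed + action, BUDGET)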
Additionally, the learnt phase of the models was designed to provide a means for conducting sensitivity analyses and financial execution drill excursions.

The research resulted in a unique application of ADP as a simulation and problem-solving method for generating financial execution commitment policies. The generated ADP policies, or commitment plans, were compared against an alternative myopic policy referred to as a stubby pencil policy. The learnt modeling phase examined and tested the reaction of both the ADP and stubby pencil policies under various expenditure conditions. The analysis showed that cumulative commitments under the ADP strategy were often equal to or less than those under the myopic stubby pencil strategy. These results suggest that a decision maker following an ADP strategy would either reach full commitment of the budget at a later date or would not reach full commitment before the end of the FY. In the latter case, the remaining uncommitted dollars indicate that improved cash utilization could be obtained by incorporating additional work or projects into the budget.
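The abstract does not define the stubby pencil policy beyond calling it myopic. Assuming, purely for illustration, that it means a straight-line plan committing an equal share of the budget each month, a comparison of cumulative commitment paths like the one described might look as follows in Python; the dollar figures and the "ADP" plan below are hypothetical stand-ins, not outputs of the thesis models.

# Hypothetical comparison of commitment plans. "Stubby pencil" is assumed
# here to mean a straight-line plan committing an equal monthly share; the
# ADP plan is a made-up back-loaded plan standing in for a learned policy.

MONTHS, BUDGET = 12, 1_200_000  # e.g., a $1.2M annual budget (invented)

stubby_pencil = [BUDGET / MONTHS] * MONTHS
adp_plan = [60_000, 70_000, 80_000, 90_000, 100_000, 100_000,
            110_000, 110_000, 120_000, 120_000, 120_000, 110_000]

cum_sp = cum_adp = 0.0
for month, (sp, adp) in enumerate(zip(stubby_pencil, adp_plan), start=1):
    cum_sp += sp
    cum_adp += adp
    flag = "<=" if cum_adp <= cum_sp else "> "
    print(f"month {month:2d}: ADP {cum_adp:>11,.0f} {flag} stubby {cum_sp:>11,.0f}")

# A year-end shortfall corresponds to the abstract's observation that
# uncommitted dollars signal room for additional work or projects.
print(f"uncommitted under ADP plan: {BUDGET - cum_adp:,.0f}")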
Keywords/Search Tags:Weapon system, Financial execution, Dynamic programming, ADP, Approach, Budget, Commitment, Decision