Value function based control for complex systems

Posted on: 2010-12-21
Degree: M.Sc
Type: Thesis
University: University of Alberta (Canada)
Candidate: Nosair, Hussam
Full Text: PDF
GTID: 2448390002987833
Subject: Engineering
Abstract/Summary:
We present a novel approach based on approximate dynamic programming (ADP) that is suitable for the control of large-scale systems with complex dynamics. Starting from an initial policy, the approach improves closed-loop performance by incrementally updating a value function on-line according to Bellman's optimality equation. To circumvent "the curse of dimensionality," the value function is represented with a piecewise parametric quadratic function approximator. Temporal-difference (TD) learning, an emerging technique in reinforcement learning (RL), is used to update this approximation on-line, so the approach can further improve the value function and control the system while learning. The proposed approach offers considerable advantages over model predictive control (MPC) and over alternative ADP methods such as instance-based and neural-network-based schemes. Case studies are presented throughout the thesis to demonstrate the applicability and advantages of the proposed approach.
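The core mechanism the abstract describes, an on-line TD update of a value function with a quadratic parameterization, can be illustrated with a minimal sketch. The snippet below is not taken from the thesis: the feature map, stage cost, starting policy, dynamics, and hyperparameters are all illustrative assumptions, and a plain (non-piecewise) quadratic approximator stands in for the thesis's piecewise parametric one.

```python
import numpy as np

def quadratic_features(x):
    """Quadratic feature map phi(x) = [1, x, vec(x x^T)] (illustrative choice)."""
    return np.concatenate(([1.0], x, np.outer(x, x).ravel()))

def td0_update(theta, x, cost, x_next, alpha=0.01, gamma=0.95):
    """One TD(0) step on the approximate cost-to-go V(x) = theta^T phi(x):

        delta = cost + gamma * V(x') - V(x)   (temporal-difference error)
        theta <- theta + alpha * delta * phi(x)
    """
    phi, phi_next = quadratic_features(x), quadratic_features(x_next)
    delta = cost + gamma * theta @ phi_next - theta @ phi
    return theta + alpha * delta * phi

# Usage: learn the value function of a stable scalar linear system under a
# fixed starting policy while the system runs (all dynamics are assumptions).
rng = np.random.default_rng(0)
x = np.array([1.0])
theta = np.zeros(quadratic_features(x).size)
for _ in range(5000):
    u = -0.5 * x                        # fixed starting policy (hypothetical)
    cost = float(x @ x + 0.1 * u @ u)   # quadratic stage cost (hypothetical)
    x_next = 0.9 * x + u + 0.01 * rng.standard_normal(1)
    theta = td0_update(theta, x, cost, x_next)
    x = x_next
```

In the actual method, learning and control are interleaved: the improving value function would also drive action selection, rather than the policy staying fixed as in this sketch.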
Keywords/Search Tags: Value function, Approach