
Adaptive critic designs and their applications

Posted on: 1998-04-16
Degree: Ph.D
Type: Dissertation
University: Texas Tech University
Candidate: Prokhorov, Danil Valentinovich
Full Text: PDF
GTID: 1468390014974804
Subject: Engineering
Abstract/Summary:
This dissertation develops a methodology for adaptive critic designs (ACDs). It is a synthesis that generalizes results obtained in the area of adaptive critics over more than twenty years. In the past, ACDs have been used primarily in domains of Markov chains. The methodology developed in this study extends ACDs to Markov sequences.

ACDs are studied in a framework of model reference adaptive control using neural networks. An ACD consists of several interacting modules: a critic network, an action network (controller), an identification network, a plant (environment), and a reference model. The critic, the essential module of an ACD, is adapted to evaluate how well the action network controls the environment. The action network is trained to optimize the outputs of the critic.

There are two main types of ACD: the evaluation function ACD and the derivative ACD. They differ in the kind of output the critic produces. A preferable way of training the critic network is established, and it is the same for both types.

Little attention has previously been devoted to finding an appropriate quality gauge for critic-based derivatives. A new insight from this study is the recognition of a close similarity between the derivatives of a form of backpropagation through time and the derivatives provided by critics. Moreover, the two types of derivatives are proven to be related in a certain way for the case of linear quadratic regulation. This insight helped to clarify several important issues that had remained obscure prior to this study: (1) when to use action-dependent derivative critics, (2) how to treat unobservable states, and (3) when to connect a reference model to the critic. In addition, a general training procedure applicable to any ACD is proposed and analyzed. It includes a family of new training procedures for derivative ACDs and their hybrids with backpropagation through time.

A new critic-based method of verifying the stability of unforced motion is proposed and tested.

All main results and conclusions are verified in several experiments (mostly computer simulations). Another new feature of this study is the extensive use of a powerful node-decoupled extended Kalman filter algorithm for ACD training.
Keywords/Search Tags: Critic, ACD, Adaptive, ACDs, Training