
Adaptive Dynamic Programming Theory On Optimal Control Scheme For Several Classes Of Nonlinear Time-delay Systems

Posted on: 2012-07-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R Z Song
GTID: 1228330467481082
Subject: Control theory and control engineering
Abstract/Summary:
Because analytical optimal solutions for nonlinear time-delay systems are hard to obtain, the optimal control of such systems remains a central and difficult problem. Adaptive dynamic programming (ADP), also referred to as approximate dynamic programming, is an algorithm for approximately solving optimal control problems and is regarded as an effective way to handle the optimal control of nonlinear systems. By combining neural networks, adaptive critic techniques, reinforcement learning, and dynamic programming theory, ADP obtains the optimal control while avoiding the "curse of dimensionality", and it has therefore received considerable attention. Further research on the theory and algorithms of ADP is thus of great importance for the optimal control of nonlinear time-delay systems. Based on ADP, this dissertation investigates several optimal control problems for nonlinear time-delay systems, including optimal stabilizing control, optimal tracking control, and finite-horizon optimal control. The main research of the dissertation can be summarized as follows:

1. A new iterative heuristic dynamic programming (HDP) algorithm is proposed to solve the optimal control problem for a class of nonlinear discrete time-delay systems with saturating actuators. To account for the saturation nonlinearity in the actuators, a nonquadratic performance index function is introduced. A state modification is used to overcome the difficulty caused by the time delays. The algorithm develops local and global optimization search processes to solve the optimal feedback control problem, and two neural networks are used to facilitate its implementation. A convergence analysis proves that the performance index function reaches the optimum under the proposed method.

2. For the first time, an HDP iteration algorithm is proposed to solve the optimal tracking control problem for a class of nonlinear discrete-time systems with time delays. The algorithm consists of state updating, control policy iteration, and performance index iteration. The states are updated in order to obtain the optimal states, and "backward iteration" is applied in the state updating. Two neural networks are used to approximate the performance index function and to compute the optimal control policy, which facilitates the implementation of the HDP iteration algorithm. A rigorous convergence analysis proves that the performance index function reaches the optimum under the proposed method.

3. An iterative dual heuristic dynamic programming (DHP) algorithm is proposed for a class of discrete time-delay systems. The method is based on echo state networks (ESNs), which provide universal approximation capability. A convergence analysis of the iterative DHP algorithm is given. On this basis, the ESN architecture is used as the approximator of the costate function at each iteration. To ensure the reliability of the ESN approximator, its mean square training error is kept within a satisfactory range.

4. A new iteration algorithm is proposed to solve the finite-horizon optimal control problem for a class of time-delay affine nonlinear systems with known system dynamics. First, the algorithm is proved to converge as the iteration step increases. Then, a theorem demonstrates that the limit of the iterative performance index function satisfies the discrete-time Hamilton-Jacobi-Bellman (DTHJB) equation, and the finite-horizon iteration algorithm is presented with a satisfactory accuracy error.

5. An ADP algorithm is proposed to solve the nearly optimal finite-horizon control problem for a class of nonaffine nonlinear time-delay systems. Two cases are considered, corresponding to two different initial iteration situations. State updating is used to determine the optimal state. Convergence is established by theorems, which prove that the performance index function reaches the optimum under the proposed method, and the relationship between iteration steps and time steps is given.

6. The time-invariant trajectory tracking control problem under N-step control is solved by finite-horizon ADP algorithms. First, the tracking control problem for a time-invariant trajectory is converted into an output regulation problem, with a cost function that guarantees minimum energy. Second, a regulation control scheme is proposed that uses the finite-horizon ADP technique to obtain the N-step control. Two theorems then prove the convergence of the proposed control algorithm.
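The value-iteration idea behind the HDP algorithms summarized above can be illustrated with a small sketch. It is not the dissertation's algorithm: the scalar system, the grids, the quadratic utility, and every function name below are hypothetical choices made for illustration. The sketch assumes that a one-step delay can be handled by treating the pair (x(k), x(k-1)) as the state, and it uses a lookup table where the dissertation uses neural networks. Starting from a zero performance index, each sweep applies V_{i+1}(z) = min_u [U(z, u) + V_i(z')], where z' is the successor of z under control u.

```python
# Hypothetical scalar system with a one-step delay (illustration only):
#   x(k+1) = 0.5 * x(k) + 0.3 * x(k-1) + u(k)
# The delay is handled here by using the pair (x(k), x(k-1)) as the state.

GRID = [i * 0.5 for i in range(-4, 5)]       # admissible x values: -2.0 .. 2.0
CONTROLS = [i * 0.25 for i in range(-4, 5)]  # bounded controls:    -1.0 .. 1.0


def snap(x):
    """Project a value onto the nearest grid point."""
    return min(GRID, key=lambda g: abs(g - x))


def step(x_now, x_prev, u):
    """Next value of x for the assumed delayed dynamics, snapped to the grid."""
    return snap(0.5 * x_now + 0.3 * x_prev + u)


def utility(x_now, u):
    """Assumed quadratic stage cost (the dissertation uses other forms)."""
    return x_now ** 2 + u ** 2


def hdp_value_iteration(num_iters=20):
    """Tabular value iteration starting from V_0 = 0.

    Returns the final table and the list of tables V_1, ..., V_num_iters.
    """
    V = {(a, b): 0.0 for a in GRID for b in GRID}
    history = []
    for _ in range(num_iters):
        # After a step, the current x becomes the delayed x of the next state.
        V_new = {
            (x_now, x_prev): min(
                utility(x_now, u) + V[(step(x_now, x_prev, u), x_now)]
                for u in CONTROLS
            )
            for (x_now, x_prev) in V
        }
        history.append(V_new)
        V = V_new
    return V, history


def greedy_control(V, x_now, x_prev):
    """Control that minimizes stage cost plus the tabulated cost-to-go."""
    return min(
        CONTROLS,
        key=lambda u: utility(x_now, u) + V[(step(x_now, x_prev, u), x_now)],
    )
```

Calling `hdp_value_iteration()` and then `greedy_control(V, x_now, x_prev)` for grid values gives an approximate feedback law for this toy problem. With a nonnegative utility and a zero initial table, the iterates never decrease from one sweep to the next, which is the kind of monotonicity property that convergence analyses of value iteration typically rely on.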
Keywords/Search Tags: adaptive dynamic programming, adaptive critic designs (ACDs), reinforcement learning (RL), nonlinear system, neural network, ESNs, time-delay systems, tracking control