
Model-Free Adaptive Dynamic Programming Algorithm Design Of Continuous-Time Affine Nonlinear Systems

Posted on: 2024-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
Full Text: PDF
GTID: 2568306914472514
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, with the rapid development of artificial intelligence and machine learning, data-driven methods have been applied to problems in many fields. Learning patterns from data offers an alternative to traditional design methods based on prior knowledge. In the control field, adaptive control identifies the controlled system from data and adaptively tunes the controller parameters. Adaptive dynamic programming is an intensively studied family of algorithms that combines the advantages of adaptive control and optimal control. This research studies the model-free optimal control problem for continuous-time input-affine nonlinear systems. The main contributions are as follows.

First, for continuous-time input-affine nonlinear systems, a novel online algorithm based on synchronous integral reinforcement learning and exploration is proposed. An actor-critic neural network structure is used to solve the integral-exploration HJB equation. The networks tune their weights online to approximate the optimal policy. The convergence of the algorithm and the stability of the closed-loop system are proved through Lyapunov analysis. The proposed method learns from online data without a priori knowledge of the dynamics model and without a pre-designed stabilizing controller.

Second, the algorithm is extended to the case in which the amplitude of the input signal is constrained. The input-constrained integral-exploration HJB equation and the weight-tuning law of the actor-critic networks are established. Proofs of convergence and stability are given.

Finally, continuous-time input-affine nonlinear non-zero-sum games are analyzed. The integral-exploration coupled HJ equations are obtained from the Nash equilibrium strategy. Two algorithms with different update methods are proposed: one converges faster, and the other has lower space complexity. In real-world implementations, the choice between them can be determined by the characteristics of the system and the performance of the hardware.
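
For orientation, the textbook optimal control setting behind the HJB equation mentioned in the abstract can be written as below. This is only a sketch under assumed notation: the symbols f, g, Q, R and the interval length T are illustrative and are not taken from the thesis, and the thesis's own integral-exploration HJB equation is not reproduced here.

```latex
% Assumed standard setting; f, g, Q, R, T are illustrative symbols, not the thesis's notation.
\begin{align}
\dot{x} &= f(x) + g(x)\,u, \\
V(x(t)) &= \int_t^{\infty} \big( Q(x) + u^\top R\,u \big)\, d\tau, \\
0 &= \min_u \Big[ Q(x) + u^\top R\,u + \nabla V^{*}(x)^\top \big( f(x) + g(x)\,u \big) \Big], \\
u^{*}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla V^{*}(x), \\
V(x(t)) &= \int_t^{t+T} \big( Q(x) + u^\top R\,u \big)\, d\tau + V(x(t+T)).
\end{align}
```

The last line is the interval (integral) form of the Bellman equation commonly used in integral reinforcement learning. It does not contain the drift term f(x) explicitly, which is what allows the value function to be evaluated from measured trajectory data.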
Keywords/Search Tags: Nonlinear systems, adaptive dynamic programming, optimal control, model-free, neural networks