| Tracking control is widely applied in military, industrial, and other fields, and the tracking control problem has long been a research hotspot in control. The objective of optimal tracking control is to design a controller that makes the system output track a given reference signal stably while minimizing a cost function. As science and technology advance, control performance requirements keep rising, yet for many systems it is often impossible to establish an accurate and effective model. Such model-free settings are common in industrial systems, so studying the model-free optimal tracking problem is of great practical significance. Since it was proposed, adaptive dynamic programming (ADP) has been regarded as an effective method for model-free optimal control. It integrates adaptive critic design, reinforcement learning, and neural networks, and it can solve the optimal control problem without requiring knowledge of the system model. Based on the theory and algorithms of ADP, this thesis studies optimal tracking control for discrete-time affine nonlinear systems and for stochastic systems with time delay. The specific work is as follows: (1) An iterative globalized dual heuristic programming (GDHP) method is applied to the optimal tracking control of a class of discrete-time affine nonlinear systems. First, to account for variation in the control signal, a control error difference term is added to the quadratic cost function. Second, the optimal tracking problem is converted into an optimal regulation problem through a system transformation, and the GDHP algorithm is then used to solve the regulation problem. Finally, a numerical example demonstrates the validity of the algorithm. (2) A reinforcement Q-learning method based on value iteration (VI) and an ADP algorithm based on the heuristic dynamic programming (HDP) structure are
proposed for a class of model-free stochastic linear quadratic (SLQ) optimal tracking problems with time delay. First, a delay operator is introduced to construct a new augmented system consisting of the original system and the command generator. Second, the stochastic problem is converted into a deterministic one through a system transformation. The Q-learning and ADP algorithms are then applied to the transformed problem, and a convergence analysis of both algorithms is given. Finally, numerical examples demonstrate the validity of the two algorithms. |