
Integrated Adaptive Dynamic Programming For Intelligent Control Of Discrete-Time Dynamical Systems

Posted on: 2024-08-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M M Ha
Full Text: PDF
GTID: 1528306914474294
Subject: Control Science and Engineering
Abstract/Summary:
In recent decades, reinforcement learning has enjoyed remarkable success across a wide range of fields. Various reinforcement learning methods have been developed to overcome the difficulties arising from different decision-making tasks. One class of reinforcement learning methods is built upon the actor-critic structure, namely adaptive critic designs, which have been widely applied in the intelligent control field. In the vast majority of control systems, since the state and control input spaces are continuous, it is necessary to introduce a function approximator into the critic structure to estimate the values of states and actions. The combination of dynamic programming, function approximation techniques, and the actor-critic structure therefore results in adaptive dynamic programming.

Traditional iterative adaptive dynamic programming algorithms mainly involve value iteration and policy iteration. The admissibility of the iterative control policies generated by policy iteration can be guaranteed, while the admissibility of those derived from value iteration is unknown. In addition, the effect of the discount factor on the stability of the closed-loop system is also unknown. On the other hand, it is unclear whether there exists an iterative adaptive dynamic programming scheme with a faster convergence rate.

Focusing on these stability and convergence-rate problems of iterative adaptive dynamic programming, this thesis establishes integrated adaptive dynamic programming methods with guarantees of stability, approximation accuracy, and fast convergence. For the discrete-time nonlinear optimal control problem based on iterative adaptive dynamic programming, a comprehensive admissibility analysis of the iterative control policies is provided. Then, the effect of the discount factor on the stability of the closed-loop system is discussed. Inspired by the successive relaxation method, a novel iterative adaptive dynamic programming framework with an adjustable convergence rate is developed. Finally, a novel stability analysis method is developed to guarantee that the tracking errors are eliminated completely. Combining the novel adaptive dynamic programming scheme with a new performance index function, the accelerated learning design is extended to the optimal tracking control problem.

The main research content of this dissertation is summarized as follows:

1) For the iterative control policies derived from traditional value iteration, stability and attraction-domain criteria are given. The theoretical results reveal that, in traditional value iteration, an admissible control policy can be obtained within finitely many iterations. Based on these theoretical foundations, a class of integrated iterative adaptive dynamic programming methods is established with a stability guarantee, and the stability of the closed-loop systems under different evolving iterative control policies is discussed in detail.

2) For iterative adaptive dynamic programming with a discount factor, the effect of the discount factor on the stability of the iterative control policies is investigated and several stability criteria are provided. For unknown system functions, a neural network is employed to model the system dynamics. The uniform ultimate boundedness of the state estimation error and of the weight and bias estimation errors is established, where the weights and biases across all layers are updated. In addition, an integrated value iteration algorithm with an accuracy guarantee is developed.

3) To improve the convergence rate of the iterative value function sequence, a novel discounted iterative adaptive dynamic programming framework is developed, in which the iterative value function sequence possesses an adjustable convergence rate. For this new value iteration algorithm, the convergence properties, the positive definiteness of the new value function sequence, and the admissibility of the obtained iterative control policies are investigated. The theoretical results demonstrate that the proposed value iteration possesses a faster convergence rate than traditional value iteration. Based on the established convergence properties, three accelerated learning algorithms are designed, which converge faster and require less computational cost than traditional value iteration.

4) For iterative adaptive dynamic programming based tracking control, a new performance index function is introduced to eliminate the tracking error completely, and a new stability analysis method is developed. Within the proposed iterative adaptive dynamic programming framework, the new value iteration based tracking control algorithm also possesses the accelerated learning capability.
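As a concrete illustration of value iteration with a discount factor and the admissibility question, the sketch below runs the discounted Bellman recursion for a scalar linear-quadratic toy problem, where the value function reduces to p*x^2 and the update to a scalar Riccati recursion. The plant and all parameter values are illustrative assumptions, not the nonlinear systems studied in the thesis.

```python
# Sketch: traditional value iteration for a scalar discrete-time system
# x_{k+1} = a*x_k + b*u_k with discounted cost sum_k gamma^k*(q*x_k^2 + r*u_k^2).
# A linear-quadratic toy problem is assumed, so V(x) = p*x^2 and the Bellman
# update becomes a scalar Riccati recursion. All numbers are assumptions.

def bellman_update(p, a, b, q, r, gamma):
    """One value-iteration step: returns the updated p and the greedy gain K."""
    gain = gamma * a * b * p / (r + gamma * b * b * p)   # policy u = -gain * x
    p_next = q + gamma * a * a * p - gamma * a * b * p * gain
    return p_next, gain

def value_iteration(a, b, q, r, gamma, tol=1e-10, max_iter=10_000):
    p, gain = 0.0, 0.0      # traditional zero initial value function V_0 = 0
    for i in range(1, max_iter + 1):
        p_next, gain = bellman_update(p, a, b, q, r, gamma)
        if abs(p_next - p) < tol:
            return p_next, gain, i
        p = p_next
    return p, gain, max_iter

# Open-loop unstable plant (|a| > 1); parameter values are assumptions.
a, b, q, r, gamma = 1.2, 1.0, 1.0, 1.0, 0.95
p_star, K, iters = value_iteration(a, b, q, r, gamma)
closed_loop = abs(a - b * K)
print(f"p* = {p_star:.4f}, K = {K:.4f}, |a - b*K| = {closed_loop:.4f} ({iters} iterations)")
# Admissibility check for the iterative policy u = -K*x: the closed loop is
# stable once |a - b*K| < 1, which can be reached after finitely many
# iterations even though the initial value function V_0 = 0 is not admissible.
```

Monitoring the closed-loop magnitude |a - b*K| along the iterations is the scalar analogue of the stability and attraction-domain criteria discussed in point 1).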
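The adjustable-convergence-rate idea can be sketched with a successive-relaxation variant of value iteration on a scalar linear-quadratic toy problem: V_{i+1} = (1 - eta)*V_i + eta*T(V_i), where T is the Bellman operator and eta is a relaxation factor, with eta = 1 recovering traditional value iteration. The plant, the cost weights, and the particular eta below are illustrative assumptions; the conditions under which such a scheme provably accelerates convergence are what the thesis establishes.

```python
# Sketch: successive-relaxation value iteration with an adjustable
# convergence rate, on a scalar toy problem x_{k+1} = a*x_k + b*u_k with
# discounted cost sum_k gamma^k*(q*x_k^2 + r*u_k^2). The update is
#     p_{i+1} = (1 - eta)*p_i + eta*T(p_i),
# where T is the scalar Bellman (Riccati) operator and eta = 1 is
# traditional value iteration. All values, including eta, are assumptions.

def riccati_map(p, a, b, q, r, gamma):
    """Scalar Bellman operator T for the quadratic value function p*x^2."""
    return q + gamma * a * a * p - (gamma * a * b * p) ** 2 / (r + gamma * b * b * p)

def relaxed_value_iteration(a, b, q, r, gamma, eta, tol=1e-10, max_iter=10_000):
    p = 0.0
    for i in range(1, max_iter + 1):
        p_next = (1.0 - eta) * p + eta * riccati_map(p, a, b, q, r, gamma)
        if abs(p_next - p) < tol:
            return p_next, i
        p = p_next
    return p, max_iter

a, b, q, r, gamma = 1.2, 1.0, 1.0, 1.0, 0.95
p_plain, n_plain = relaxed_value_iteration(a, b, q, r, gamma, eta=1.0)
p_fast, n_fast = relaxed_value_iteration(a, b, q, r, gamma, eta=1.2)
print(f"eta=1.0: {n_plain} iterations; eta=1.2: {n_fast} iterations")
# Both runs reach the same fixed point; the over-relaxed run (eta > 1)
# converges in fewer iterations for this particular example.
```

The intuition is that near the fixed point the relaxed map has slope (1 - eta) + eta*T'(p*), so a suitable eta can shrink the contraction factor well below that of the plain Bellman update.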
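The role of the performance index in tracking control can be illustrated on a scalar toy problem: penalizing the control deviation from its steady-state value, rather than the raw control input, keeps the origin of the error system cost-free, so the optimal policy can drive the tracking error to zero exactly. This is only an illustration of the general idea behind a modified performance index, not the thesis's construction; the plant, reference, and weights below are assumptions.

```python
# Sketch: optimal tracking for x_{k+1} = a*x_k + b*u_k toward a constant
# reference x_d. The performance index penalizes the tracking error
# e_k = x_k - x_d and the control deviation v_k = u_k - u_ss, where u_ss is
# the steady-state control satisfying x_d = a*x_d + b*u_ss. Penalizing v_k
# instead of u_k leaves e = 0 cost-free, so the tracking error can be
# eliminated completely. All numbers are illustrative assumptions.

def tracking_gain(a, b, q, r, gamma, iters=1000):
    """Value iteration on the error system e_{k+1} = a*e_k + b*v_k."""
    p = 0.0
    for _ in range(iters):
        gain = gamma * a * b * p / (r + gamma * b * b * p)
        p = q + gamma * a * a * p - gamma * a * b * p * gain
    return gain

a, b, q, r, gamma = 1.2, 1.0, 1.0, 1.0, 0.95
x_d = 3.0
u_ss = (1.0 - a) * x_d / b          # steady-state control for the reference
K = tracking_gain(a, b, q, r, gamma)

x = 0.0                             # simulate the closed-loop tracking
for _ in range(60):
    u = u_ss - K * (x - x_d)
    x = a * x + b * u
print(f"x after 60 steps = {x:.6f} (reference {x_d})")
# The error obeys e_{k+1} = (a - b*K)*e_k with |a - b*K| < 1, so x -> x_d
# geometrically, with no steady-state offset.
```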
Keywords/Search Tags: Reinforcement learning, adaptive dynamic programming, optimal control, tracking control, discrete-time nonlinear systems, integrated iteration