Reinforcement Learning-based Data Driven Control For Nonlinear Discrete-time Systems

Posted on:2022-10-25

Degree:Master

Type:Thesis

Country:China

Candidate:M D Lin

Full Text:PDF

GTID:2518306539468864

Subject:Control Science and Engineering

Abstract/Summary:

As controlled systems grow in scale and complexity, it becomes difficult to establish their mechanism models. Traditional model-based control methods therefore face difficulties such as low model accuracy and high computational cost, which limit their application. How to design controllers directly from input-output data has thus become a hot topic in the control field. At the same time, advances in science and technology make it possible to collect and store massive amounts of data during industrial processes, much of it redundant, so improving the efficiency of data utilization is also an urgent problem. In addition, controlled systems inevitably suffer from external disturbances, and improving the robustness of the controller is a further challenge for controller design. Reinforcement learning, which is grounded in machine learning theory, has self-learning and self-optimization capabilities and great potential for solving model-free intelligent control problems. Adaptive dynamic programming, which combines neural networks, dynamic programming, and reinforcement learning, avoids the "curse of dimensionality" of traditional dynamic programming and has significant applications in the control of complex systems. This thesis therefore combines the ideas of adaptive dynamic programming, reinforcement learning, and data-driven control. For optimal control with unknown dynamics, data-driven adaptive control methods based on reinforcement learning are developed, and some attempts are made to apply them in practice. The main contents of the thesis are as follows.

(1) For optimal tracking control problems, an off-policy, model-free optimal tracking control method is proposed. First, for nonlinear discrete-time systems, the optimal tracking control problem is converted into an optimal regulation problem through a system transformation. The policy gradient is then used to update the control policy directly, without requiring the system model. Second, an actor-critic framework is constructed to approximate the iterative Q-function and improve the control policy. Theoretical analysis guarantees the convergence of the iterative algorithm and shows that the weight estimation errors of the two neural networks are uniformly ultimately bounded. It is worth mentioning that the proposed method is realized in a model-free manner, which improves its extensibility.

(2) For unknown nonlinear discrete-time systems subject to external disturbances, a policy optimization control method for the model-free zero-sum game is developed. The problem is regarded as a two-player zero-sum game, and the optimal solution, namely the Nash equilibrium, is obtained by solving the nonlinear Hamilton-Jacobi-Isaacs (HJI) equation. Because an analytical solution of the HJI equation is difficult to obtain, a policy optimization control method is proposed. First, an iterative Q-function that depends on the system state, the control player, and the disturbance player is defined. Then, the policy gradient is used to update the control policy and the disturbance policy, respectively. The monotonicity and convergence of the iterative algorithm are established. To implement the algorithm, a critic-actor-disturbance neural network architecture is constructed. By employing the experience replay technique, the efficiency of data utilization is improved and the stability of the learning process is guaranteed.

In summary, this thesis mainly studies data-driven optimal control methods that use reinforcement learning to solve optimal tracking control problems for nonlinear discrete-time systems and two-player zero-sum game problems with unknown system dynamics. The research involves modern control theory, neural networks, adaptive dynamic programming, and reinforcement learning, and aims to develop intelligent control methods that solve learning control problems and can be applied to the control of complex systems.
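
As an illustration of the idea in item (2), the following is a minimal sketch, not the thesis's algorithm. It assumes a scalar state, linear policies u = k*x and w = l*x, and a hand-written quadratic Q-function standing in for a trained critic network. The names ReplayBuffer, K_STAR, L_STAR, dq_du, dq_dw, and train are hypothetical. The control gain descends the gradient of Q, the disturbance gain ascends it, and each update averages over a minibatch drawn from an experience replay buffer.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past samples; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def add(self, sample):
        self.data.append(sample)

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement from the stored samples.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))


# Assumed stand-in for a learned critic (not from the thesis):
#   Q(x, u, w) = x^2 + (u - K_STAR*x)^2 - (w - L_STAR*x)^2
# It is convex in u and concave in w, so its saddle point is
#   u = K_STAR*x, w = L_STAR*x.
K_STAR, L_STAR = -0.5, 0.2


def dq_du(x, u, w):
    """Partial derivative of the assumed Q with respect to the control u."""
    return 2.0 * (u - K_STAR * x)


def dq_dw(x, u, w):
    """Partial derivative of the assumed Q with respect to the disturbance w."""
    return -2.0 * (w - L_STAR * x)


def train(steps=2000, batch_size=16, lr=0.05, seed=0):
    rng = random.Random(seed)
    buffer = ReplayBuffer(capacity=200)
    k, l = 0.0, 0.0  # gains of the control and disturbance policies
    for _ in range(steps):
        # Store a newly observed state (drawn at random here for simplicity).
        buffer.add(rng.uniform(-1.0, 1.0))
        batch = buffer.sample(batch_size, rng)
        # Chain rule: dQ/dk = dQ/du * x and dQ/dl = dQ/dw * x.
        grad_k = sum(dq_du(x, k * x, l * x) * x for x in batch) / len(batch)
        grad_l = sum(dq_dw(x, k * x, l * x) * x for x in batch) / len(batch)
        k -= lr * grad_k  # control player minimizes Q
        l += lr * grad_l  # disturbance player maximizes Q
    return k, l


if __name__ == "__main__":
    k, l = train()
    print(k, l)
```

In this toy setting the two gains approach K_STAR and L_STAR, the saddle point of the assumed Q-function. In the setting the abstract describes, the Q-function itself would be approximated by a critic network trained from measured data rather than written down in closed form.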

Keywords/Search Tags:

Adaptive dynamic programming, Reinforcement learning, Neural networks, Data-driven, Zero-sum game, Experience replay, Policy gradient

Related items

1	Research And Implementation On Game Control Algorithm Based On Deepening Reinforcement Learning
2	Improvement And Application Of Deep Reinforcement Learning Based On Experience Replay Mechanism
3	Research On Learning Of The Optimal Policy In Largescale State Space
4	Research On Optimization Methods Of The Experience Replay Mechanism For Off-policy Reinforcement Learning
5	Research On Experience Replay Method For Deep Reinforcement Learning
6	Research Of Game Intelligence Based On Improved Policy Gradient Method
7	Research On Game Algorithm Based On Fictitious Self-play With Prioritized Experience Replay
8	Research On Fast Policy Gradient Algorithms Of Reinforcement Learning Based On Adaptive Learning Rate
9	Study Of Robot Arm Control Based On Deep Reinforcement Learning
10	Research On Optimization Method Of Deep Reinforcement Learning Experience Replay