
Research On Optimal Control Based On The Reinforcement Learning Method

Posted on: 2022-08-01    Degree: Master    Type: Thesis
Country: China    Candidate: Z F Xiao    Full Text: PDF
GTID: 2518306785951029    Subject: Automation Technology
Abstract/Summary:
Optimal control is widely used in many areas of production and daily life, and is of considerable research significance and practical value. As the scale and complexity of controlled systems grow in practice, accurate system identification becomes more difficult and more costly. Data-driven optimal control methods have therefore attracted a great deal of attention, and the development of reinforcement learning has provided new ideas and practical approaches in this direction. The main results of this thesis are summarized as follows.

1. For the multi-player nonzero-sum game problem of linear discrete-time systems, an off-policy game Q-learning algorithm is proposed for the first time, integrating game theory and adaptive dynamic programming within the reinforcement learning framework. The algorithm does not require knowledge of the system model parameters; the Nash equilibrium solution of the multi-player nonzero-sum game is learned using measurable system data only. When probing noise is added to satisfy the persistence-of-excitation condition, the thesis proves that the on-policy game Q-learning algorithm yields biased Nash equilibrium solutions, whereas the proposed off-policy game Q-learning algorithm yields unbiased ones, and the convergence of the proposed algorithm is also proved. Simulation results verify the effectiveness of the proposed method (a minimal single-player sketch is given below).

2. To solve the H-infinity control problem of linear discrete-time multi-player game systems with external disturbances, an off-policy game Q-learning algorithm for multi-player zero-sum games is proposed under the architecture of adaptive dynamic programming and reinforcement learning, by introducing the concept of zero-sum games. Given the clear advantages of off-policy learning over on-policy learning, the thesis combines off-policy learning with Q-learning, thereby learning the Nash equilibrium solution of the multi-player zero-sum game with completely unknown system dynamics while satisfying the disturbance attenuation condition. In addition, the convergence of the proposed algorithm and the unbiasedness of the learning results are rigorously proved. Simulation results verify the effectiveness of the proposed method (see the game Q-function sketch below).

3. A novel off-policy cooperative game Q-learning algorithm is proposed to achieve optimal tracking control of linear discrete-time multi-player game systems subject to exogenous dynamic disturbances. The key idea, presented for the first time, is to integrate reinforcement learning and cooperative games with output regulation under the discrete-time sampling framework of multi-player games, achieving data-driven optimal tracking control and disturbance rejection. The coordination equilibrium solution and the steady-state control laws are learned from data in a model-free manner, so that the resulting robust optimal control policies tolerate disturbances and track the reference signal optimally. The convergence of the proposed algorithm and the unbiasedness of the learning results in the presence of probing noise are rigorously demonstrated. Simulation results verify the effectiveness of the proposed method (see the output-regulation sketch below).
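To make the data-driven mechanism behind these results concrete, the following is a minimal illustrative sketch of off-policy Q-learning for the single-player special case of the first contribution, written in Python with NumPy. The system matrices, cost weights, noise level, and all function names are assumptions made here for illustration, not the thesis's actual formulation; the multi-player algorithm learns a coupled game Q-function for each player in the same spirit.

import numpy as np

# Hypothetical stable second-order system, used only for illustration.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Qc = np.eye(2)            # state cost weight (assumed)
Rc = np.array([[1.0]])    # control cost weight (assumed)
n, m = B.shape

def phi(z):
    # Quadratic basis so that phi(z) @ theta = z' H z for symmetric H.
    Z = np.outer(z, z)
    i, j = np.triu_indices(len(z))
    return np.where(i == j, 1.0, 2.0) * Z[i, j]

def collect(K_behav, steps=300, seed=1):
    # Generate off-policy data: behavior policy plus probing noise.
    rng = np.random.default_rng(seed)
    x, data = rng.standard_normal(n), []
    for _ in range(steps):
        u = -K_behav @ x + 0.5 * rng.standard_normal(m)
        r = x @ Qc @ x + u @ Rc @ u
        x_next = A @ x + B @ u
        data.append((x, u, r, x_next))
        x = x_next
    return data

def off_policy_q_learning(data, K, iters=15):
    # Policy iteration on the Q-function, reusing one fixed data set.
    d = n + m
    for _ in range(iters):
        Phi, y = [], []
        for x, u, r, x_next in data:
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, -K @ x_next])  # target-policy action
            Phi.append(phi(z) - phi(z_next))   # Bellman-equation regressor
            y.append(r)
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        Hu = np.zeros((d, d))
        Hu[np.triu_indices(d)] = theta
        H = Hu + Hu.T - np.diag(np.diag(Hu))        # rebuild symmetric H
        K = np.linalg.solve(H[n:, n:], H[n:, :n])   # greedy policy improvement
    return K, H

K0 = np.zeros((m, n))     # stabilizing initial gain (A is Schur stable here)
K_opt, H = off_policy_q_learning(collect(K0), K0)
print("learned feedback gain:", K_opt)

The sketch reuses one fixed batch of behavior-policy data across all policy-evaluation steps, which is the defining feature of the off-policy scheme: the probing noise excites the data without biasing the learned Q-function, consistent with the unbiasedness property the abstract claims for the off-policy algorithm.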
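For the zero-sum H-infinity setting of the second contribution, the same machinery extends by treating the disturbance as a maximizing player. The following LaTeX sketch shows the standard quadratic game Q-function and its saddle-point policy update; the symbols and the exact multi-player equations of the thesis may differ.

Q_i(x_k, u_k, w_k) = x_k^\top Q x_k + u_k^\top R u_k - \gamma^2 w_k^\top w_k
                     + Q_i(x_{k+1}, -K_i x_{k+1}, -L_i x_{k+1})

With Q_i(z) = z^\top H^i z and z = [x^\top \; u^\top \; w^\top]^\top, the saddle-point policy update reads

\begin{bmatrix} K_{i+1} \\ L_{i+1} \end{bmatrix}
= \begin{bmatrix} H^i_{uu} & H^i_{uw} \\ H^i_{wu} & H^i_{ww} \end{bmatrix}^{-1}
  \begin{bmatrix} H^i_{ux} \\ H^i_{wx} \end{bmatrix}

so that u_k = -K_{i+1} x_k minimizes and w_k = -L_{i+1} x_k maximizes the quadratic form, which is the discrete-time game-theoretic counterpart of the single-player greedy update above.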
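The third contribution couples this learning scheme with output regulation. As a hedged reminder of the underlying structure (standard linear output-regulation theory, with symbols chosen here for illustration rather than taken from the thesis), the exosystem, regulator equations, and steady-state control law take the form:

v_{k+1} = E v_k, \qquad x_{k+1} = A x_k + B u_k + D v_k, \qquad e_k = C x_k - F v_k

X E = A X + B U + D, \qquad 0 = C X - F

u_k = U v_k - K (x_k - X v_k)

where (X, U) solves the regulator equations, so the steady-state pair (x_k, u_k) = (X v_k, U v_k) drives the tracking error e_k to zero; the thesis learns both the feedback gain and the steady-state control laws from data in a model-free manner rather than from the model.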
Keywords/Search Tags:Reinforcement Learning, Adaptive Dynamic Programming, Game Theory, Output Regulation, Nash Equilibrium, Multi-Player Games