
Stochastic Discrete Linear Quadratic Optimal Control Problem Based On ADP Algorithm

Posted on: 2019-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: R R Liu
Full Text: PDF
GTID: 2370330578472924
Subject: Applied Mathematics

Abstract/Summary:
With the continuous development of modern science and technology, industrial production processes have become increasingly complex. Influenced by uncertainty, strong nonlinearity, and multivariable factors, it is very difficult to describe the dynamic characteristics of such systems with an accurate mathematical model, which greatly complicates solving for the optimal performance index. The optimal control problem for systems with unknown dynamics has therefore become a hot research topic. This thesis adopts a neural-network-based adaptive dynamic programming (ADP) method to solve the infinite-horizon linear quadratic (LQ) optimal control problem for unknown stochastic discrete-time systems. The main research contents are as follows:

1. The infinite-horizon LQ optimal control problem for an unknown mean-field stochastic discrete-time system is studied. First, the original Riccati equation is extended to generalized Riccati equations, the state feedback gain matrix is extended to a pair of gain matrices, and sufficient conditions for the existence of the mean-field LQ optimal control are given. Next, the stochastic system is transformed into a deterministic system, a value-iteration ADP algorithm is proposed, and its convergence is analyzed. Meanwhile, back-propagation (BP) neural networks are used to design the model network, critic network, and action network, which estimate the unknown system model, the objective function, and the control strategy, respectively. Finally, the effectiveness of the ADP method is verified by system simulation.

2. The infinite-horizon optimal strategy problem for an unknown stochastic discrete-time Stackelberg game system is solved. First, the stochastic system is transformed into a deterministic system, and the existence of the policy set is established. Second, under the condition that a Nash equilibrium is satisfied, the ADP algorithm is applied to construct the iterative equations. When constructing them, the interaction between the leader and the followers must be considered: the structure of each iterative equation must be consistent with that of the corresponding objective function, and the number of iterative equations is determined by the number of leaders and followers. The convergence proof is also discussed. Next, BP neural networks are used to design a heuristic dynamic programming (HDP) controller, which estimates the system state, the objective function, and the optimal policy set by training the model network, critic network, and action network, respectively. Finally, the effectiveness of the algorithm is verified by simulation experiments.
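To illustrate the value-iteration ADP scheme of part 1 in its simplest special case, the following sketch runs plain Riccati value iteration for a deterministic discrete-time LQ problem. The mean-field extension (generalized Riccati equation pair) and the neural-network approximation are not reproduced; the example system matrices are hypothetical, not taken from the thesis.

```python
import numpy as np

def lq_value_iteration(A, B, Q, R, tol=1e-10, max_iter=10000):
    """Value iteration for the discrete-time LQ problem:
    P_{k+1} = Q + A'P_k A - A'P_k B (R + B'P_k B)^{-1} B'P_k A,
    starting from P_0 = 0. Returns (P, K) with u = -K x the optimal law."""
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(max_iter):
        # Feedback gain induced by the current value-function matrix P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

# Hypothetical example system (a discretized double integrator):
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K = lq_value_iteration(A, B, Q, R)
```

At convergence, P satisfies the discrete algebraic Riccati equation and the closed-loop matrix A - BK is Schur stable; in the ADP setting of the thesis, the same fixed-point iteration is carried out with the critic network standing in for P.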
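The role of the HDP critic in part 2 can be sketched as repeated regression onto bootstrapped Bellman targets (fitted value iteration). In this minimal sketch, an exact quadratic feature basis stands in for the BP critic network, the policy is a fixed stabilizing gain, and the model network, action network, and leader-follower game structure are all omitted; every name and the example system are assumptions, not the thesis's construction.

```python
import numpy as np

def quad_features(x):
    # Monomials x_i * x_j (i <= j): an exact basis for quadratic value functions.
    n = len(x)
    return np.array([x[i] * x[j] for i in range(n) for j in range(i, n)])

def hdp_critic_fit(A, B, Q, R, K, n_samples=200, n_sweeps=60, seed=0):
    """Fit critic weights w so that V(x) = w'phi(x) satisfies the Bellman
    equation V(x) = x'Qx + u'Ru + V(Ax + Bu) along u = -Kx, by repeatedly
    regressing V onto targets bootstrapped from the previous critic."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    w = np.zeros(len(quad_features(np.zeros(n))))
    for _ in range(n_sweeps):
        X = rng.standard_normal((n_samples, n))
        Phi = np.array([quad_features(x) for x in X])
        y = []
        for x in X:
            u = -K @ x
            x_next = A @ x + B @ u
            # One-step cost plus bootstrapped value of the successor state.
            y.append(x @ Q @ x + u @ R @ u + quad_features(x_next) @ w)
        w = np.linalg.lstsq(Phi, np.array(y), rcond=None)[0]
    return w

# Hypothetical stable example (not the thesis's mean-field or game system):
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K = np.array([[0.1, 0.5]])   # a stabilizing (not necessarily optimal) gain
w = hdp_critic_fit(A, B, Q, R, K)
```

Because the feature basis is exact here, each sweep is an exact Bellman backup on the critic coefficients; with a neural-network critic the regression is only approximate, which is where the convergence analysis discussed in the thesis becomes necessary.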
Keywords/Search Tags: linear-quadratic optimal control, adaptive dynamic programming, back-propagation neural network, mean-field system, stochastic discrete-time Stackelberg game system