Online Learning Algorithms For Stochastic Linear Quadratic Control And Games

Posted on: 2024-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: B Q Zhang
Full Text: PDF
GTID: 2568306920950599
Subject: Control engineering
Abstract/Summary:
Traditional control theory generally focuses on model-based approaches, which have been widely and deeply studied by control scientists. However, as productivity develops, control objects are becoming more complex and the requirements for control accuracy are rising. In practical applications such as robot control, autonomous driving, and unmanned aerial vehicle control, it is not uncommon to encounter control systems that are too intricate to model precisely, impractical to experiment on, and lacking any prior knowledge. Model-free control was first proposed more than ten years ago, and related research has since grown rapidly. The development of machine learning provides control researchers with new tools for these problems.

This thesis studies model-free control for two types of systems. For stochastic linear quadratic control, a model-free algorithm that searches for the optimal control under unknown dynamics is proposed, based on an off-line algorithm and Itô's formula. For nonzero-sum stochastic games, an online Q-learning algorithm based on the actor-critic structure is designed. The convergence of both methods is studied using the inverse Lyapunov theorem, stochastic Lyapunov functions, and other techniques. The main research conclusions are as follows:

(1) For stochastic linear quadratic control, a model-free learning algorithm is designed. First, an off-line value iteration is proposed to solve the Riccati equation; this step requires the system parameters. Building on it, an online learning algorithm that does not use the system parameters is designed using dynamic programming and Itô's formula. The convergence of the proposed algorithms is studied, and they are proved to converge under some mild conditions.

(2) For nonzero-sum stochastic linear quadratic games, an online algorithm that solves for the Nash equilibrium without using system parameters is proposed. First, a Q-function is constructed for each player, and then the Nash equilibrium is solved using an actor-critic structure. For each player, the critic network estimates the Q-function and the actor network estimates the control. An estimation error is constructed for each neural network, and a tuning law for updating the network weights is derived using gradient descent. By constructing a stochastic Lyapunov function, it is proved that the game converges to the Nash equilibrium under some mild conditions.
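
To make the off-line step in (1) concrete, the following is a minimal sketch of a value iteration for a stochastic Riccati equation. It is not the algorithm of the thesis, which the abstract does not spell out. It assumes the standard continuous-time model dx = (Ax + Bu)dt + (Cx + Du)dW with cost E∫(x'Qx + u'Ru)dt, and it uses a simple fixed-step update P ← P + step·F(P), where F is the Riccati residual. The matrices, the step size, and all function names are hypothetical choices made for illustration.

```python
import numpy as np


def riccati_residual(P, A, B, C, D, Q, R):
    """Residual F(P) of the stochastic algebraic Riccati equation (assumed model)."""
    S = R + D.T @ P @ D            # control weight as seen through the diffusion term
    M = B.T @ P + D.T @ P @ C      # cross term between state and control
    return A.T @ P + P @ A + C.T @ P @ C + Q - M.T @ np.linalg.solve(S, M)


def value_iteration(A, B, C, D, Q, R, step=0.01, tol=1e-10, max_iter=200_000):
    """Iterate P <- P + step * F(P) from P = 0 until the residual norm is below tol."""
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(max_iter):
        F = riccati_residual(P, A, B, C, D, Q, R)
        if np.linalg.norm(F) < tol:
            break
        P = P + step * F
        P = 0.5 * (P + P.T)        # keep P symmetric against round-off
    return P


def feedback_gain(P, B, C, D, R):
    """Gain K of the feedback control u = -K x associated with P."""
    return np.linalg.solve(R + D.T @ P @ D, B.T @ P + D.T @ P @ C)


# Hypothetical two-state, one-input system (not taken from the thesis)
A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([[0.0], [1.0]])
C = 0.1 * np.eye(2)
D = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = value_iteration(A, B, C, D, Q, R)
K = feedback_gain(P, B, C, D, R)
print("P =\n", P)
print("K =", K)
```

Note that this iteration needs A, B, C, and D explicitly, which is the limitation the abstract points out for the off-line step; the online algorithm described in (1) is designed to avoid that requirement.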
Keywords/Search Tags: model-free control, reinforcement learning, nonzero-sum stochastic linear quadratic games, stochastic linear quadratic control