Online Learning Algorithms For Stochastic Linear Quadratic Control And Games

Posted on:2024-08-20

Degree:Master

Type:Thesis

Country:China

Candidate:B Q Zhang

GTID:2568306920950599

Subject:Control engineering

Abstract/Summary:

Traditional control theory focuses mainly on model-based approaches, which control scientists have studied widely and in depth. However, as productivity develops, controlled objects become more complex and the required control accuracy rises. In practical applications such as robot control, autonomous driving, and unmanned aerial vehicle control, it is not uncommon to encounter systems that are too intricate to model precisely, impractical to experiment on, and for which no prior knowledge is available. Model-free control was first proposed more than ten years ago, and related research has grown rapidly since. The development of machine learning gives control researchers new tools for these problems. In this thesis, model-free control is studied for two types of systems. For stochastic linear quadratic control, a model-free algorithm that searches for the optimal control under unknown dynamics is proposed, based on an off-line algorithm and Itô's formula. For nonzero-sum stochastic games, an online Q-learning algorithm based on the actor-critic structure is designed. The convergence of both methods is studied using the converse Lyapunov theorem, stochastic Lyapunov functions, and other techniques. The main research conclusions are as follows:

(1) For stochastic linear quadratic control, a model-free learning algorithm is designed. First, an off-line value iteration is proposed to solve the Riccati equation; this iteration requires the system parameters. Based on it, an online learning algorithm that does not use the system parameters is designed using dynamic programming and Itô's formula. The convergence of the proposed algorithms is studied, and they are proved to converge under some mild conditions.

(2) For nonzero-sum stochastic linear quadratic games, an online algorithm that solves for the Nash equilibrium without using the system parameters is proposed. First, a Q-function is constructed for each player, and then the Nash equilibrium is solved using an actor-critic structure. For each player, the critic network estimates the Q-function and the actor network estimates the control. An estimation error is constructed for each neural network, and a tuning law for updating the network weights is derived using gradient descent. By constructing a stochastic Lyapunov function, it is proved that the game converges to the Nash equilibrium under some mild conditions.
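To make the off-line step in conclusion (1) concrete, the following is a minimal sketch of a value iteration for a stochastic Riccati equation. It is not the thesis's algorithm, which the abstract does not spell out. It assumes a system of the form dx = (Ax + Bu)dt + (Cx + Du)dW with cost integrand x'Qx + u'Ru, and it uses a plain fixed-step update P <- P + step * F(P) starting from P = 0. The names `sare_residual` and `value_iteration`, the step size, and the example matrices are all hypothetical choices made for illustration.

```python
import numpy as np

def sare_residual(P, A, B, C, D, Q, R):
    """Residual F(P) of the assumed stochastic algebraic Riccati equation.

    F(P) = A'P + PA + C'PC + Q - (PB + C'PD)(R + D'PD)^{-1}(B'P + D'PC)
    """
    S = B.T @ P + D.T @ P @ C      # cross term B'P + D'PC
    G = R + D.T @ P @ D            # effective control weight R + D'PD
    return A.T @ P + P @ A + C.T @ P @ C + Q - S.T @ np.linalg.solve(G, S)

def value_iteration(A, B, C, D, Q, R, step=0.01, iters=20000, tol=1e-10):
    """Fixed-step value iteration from P = 0 (uses the system matrices).

    Returns the final P and the feedback gain K for the control u = -K x.
    """
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(iters):
        F = sare_residual(P, A, B, C, D, Q, R)
        if np.linalg.norm(F) < tol:
            break
        P = P + step * F
        P = 0.5 * (P + P.T)        # keep P symmetric against round-off
    K = np.linalg.solve(R + D.T @ P @ D, B.T @ P + D.T @ P @ C)
    return P, K

# Hypothetical two-state example with multiplicative noise.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = 0.1 * np.eye(2)
D = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.eye(1)

P, K = value_iteration(A, B, C, D, Q, R)
print("P =", P)
print("K =", K)
```

The iteration needs A, B, C, and D explicitly, which is exactly the limitation the abstract attributes to the off-line method; the online algorithm in the thesis is described as removing that requirement by working from data via Itô's formula.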

Keywords/Search Tags:

model-free control, reinforcement learning, nonzero-sum stochastic linear quadratic games, stochastic linear quadratic control

Related items

1	Study On Linear Quadratic Mean Field Social Control With Unmodeled Dynamics And Multiplicative Noise
2	Design Of Manipulator Based On Model Learning And Linear Quadratic Optimal Control
3	Study On Regular/Irregular Linear Quadratic Optimal Control Problem And Its Applications
4	Quadratic Differential Games Based On The Uncertainty Of Countermeasures Information Structure
5	Output Feedback Control For Nonlinear Quadratic Systems With Incomplete Measurements
6	Research On Some Control Problems For Stochastic Mean Field Systems
7	Research On Design Of Attack Parameter Dependent Controller For Networked Linear Stochastic Control Systems Under Network Attacks
8	Indefinite Linear Quadratic Control Problem Of Stochastic System And Its Applications
9	Discounted-cost Linear Quadratic Regulation Of A Class Of Switched Linear Systems
10	Continuous Singular Systems With Reliable Linear Quadratic Optimal Control