
Model-based Safe Reinforcement Learning

Posted on: 2022-07-10    Degree: Master    Type: Thesis
Country: China    Candidate: L Y Du    Full Text: PDF
GTID: 2518306572451344    Subject: Control Science and Engineering
Abstract/Summary:
As a learning-based planning method, reinforcement learning (RL) is widely studied because it can plan without prior knowledge of the dynamics. Traditional model-free RL methods suffer from high sample complexity: they must collect large amounts of data to learn the optimal policy. By contrast, model-based RL methods first learn a model that approximates the dynamics and then use that model to plan actions, which improves sample efficiency dramatically. Moreover, to apply RL algorithms in safety-critical domains, it is crucial to ensure that the agent executes only safe actions, so that the constraints are always satisfied. Such problems are generally formulated as constrained planning problems. However, because the learned model is inaccurate, it is difficult to guarantee the safety of actions planned with it. To address this problem, this thesis proposes a sample-efficient, model-based RL planning method that maintains safety throughout the learning process. The main results are summarized as follows:

(1) Quantifying the uncertainty of the predicted state trajectory. Considering a discrete-time nonlinear system with bounded noise, this thesis learns the dynamics with an ensemble of probabilistic neural networks, which captures and quantifies both the epistemic and the aleatoric uncertainty arising during model learning (see the first sketch below). Based on this model, a method is developed to bound the error between the state trajectory predicted by the learned model and the real state trajectory under a given action sequence.

(2) Model predictive control based on the learned model. Model predictive control (MPC) is used to solve planning problems with safety constraints. By combining the original constraints with the uncertainty of the predicted state trajectory, a tube-based constraint is introduced: satisfying the tube-based constraint implies that the real state trajectory satisfies the original constraints (see the tube-check sketch below). Bumpless constraints, which cannot be ignored in real-world applications, are also introduced. Based on these constraints, a mathematical model for MPC is formulated.

(3) A robust method for solving the constrained optimization problems. Based on the cross-entropy method, this thesis proposes a solver for the complex constrained optimization problems defined by the tube-based constraints, the ensemble of probabilistic neural networks, and MPC (see the CEM sketch below). In addition, a method is developed to handle the case where these optimization problems have no feasible solution, which improves the robustness of the proposed method.

(4) A safe model-based reinforcement learning algorithm and its evaluation. Building on the above results, this thesis proposes a safe model-based RL algorithm that plans high-performance actions while satisfying the constraints throughout the learning process; the final sketch below shows how the pieces fit together. Several constrained navigation experiments with unmanned ground vehicles are designed to evaluate the proposed algorithm in terms of safety, control performance, and sample efficiency.
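The abstract names its core techniques but includes no code, so the sketches below are illustrative only. First, a minimal sketch of an ensemble of probabilistic neural networks, assuming PyTorch and Gaussian next-state predictions trained with a negative log-likelihood loss: the disagreement between member means serves as epistemic uncertainty, and the average predicted variance as aleatoric uncertainty. All class and function names are hypothetical, not the thesis's implementation.

```python
import torch
import torch.nn as nn

class ProbabilisticNet(nn.Module):
    """One ensemble member: predicts a Gaussian over the next state.
    The predicted variance captures aleatoric (noise) uncertainty."""
    def __init__(self, state_dim, action_dim, hidden=200):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
        )
        self.mean_head = nn.Linear(hidden, state_dim)
        self.logvar_head = nn.Linear(hidden, state_dim)

    def forward(self, state, action):
        h = self.body(torch.cat([state, action], dim=-1))
        mean = self.mean_head(h)
        logvar = self.logvar_head(h).clamp(-10.0, 2.0)  # keep variance well-behaved
        return mean, logvar

def gaussian_nll(mean, logvar, target):
    """Negative log-likelihood loss used to train each member."""
    inv_var = torch.exp(-logvar)
    return (((target - mean) ** 2) * inv_var + logvar).mean()

def ensemble_predict(members, state, action):
    """Epistemic uncertainty = variance across member means (needs >= 2 members);
    aleatoric uncertainty = average predicted variance."""
    means, variances = [], []
    for net in members:
        mean, logvar = net(state, action)
        means.append(mean)
        variances.append(torch.exp(logvar))
    means = torch.stack(means)       # (n_members, batch, state_dim)
    variances = torch.stack(variances)
    epistemic = means.var(dim=0)
    aleatoric = variances.mean(dim=0)
    return means.mean(dim=0), epistemic, aleatoric
```

The standard recipe for such ensembles is to minimize `gaussian_nll` per member on bootstrapped subsets of the replay data, so that members disagree where data is scarce.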
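Next, a sketch of the tube-based constraint check from contribution (2), under the simplifying assumption that the constraint function is 1-Lipschitz, so the per-step error bound can be added directly to the constraint margin; `constraint_fn` and `error_bounds` are placeholder names, and the thesis's actual tightening may differ.

```python
import numpy as np

def tube_violation(pred_states, error_bounds, constraint_fn):
    """Tube-based check: the nominal trajectory is deemed safe only if the
    constraint margin at every step exceeds that step's error bound between
    predicted and real states. constraint_fn(s) <= 0 means s is safe.
    Returns the worst-case violation (<= 0 means the whole tube is safe)."""
    margins = np.array([constraint_fn(s) for s in pred_states])
    return float(np.max(margins + error_bounds))
```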
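For contribution (3), a sketch of a cross-entropy-method planner over action sequences. The additive penalty on constraint violations is one plausible way to keep the search well-behaved when no feasible sequence exists, not necessarily the thesis's infeasibility-handling mechanism; `rollout_fn` is a placeholder for a rollout under the learned model.

```python
import numpy as np

def cem_plan(rollout_fn, horizon, action_dim, act_low, act_high,
             n_samples=500, n_elites=50, n_iters=5, penalty=1e4):
    """Cross-entropy method over action sequences. rollout_fn maps an action
    sequence to (cost, worst constraint violation) under the learned model
    with tube-tightened constraints. Infeasible samples are penalized in
    proportion to their violation, so the search degrades gracefully."""
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(n_iters):
        samples = np.clip(
            mean + std * np.random.randn(n_samples, horizon, action_dim),
            act_low, act_high)
        scores = np.empty(n_samples)
        for i, seq in enumerate(samples):
            cost, violation = rollout_fn(seq)
            scores[i] = cost + penalty * max(violation, 0.0)  # prefer feasible
        elites = samples[np.argsort(scores)[:n_elites]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean[0]  # first action of the best plan (receding horizon)
```

At every control step the planner returns only the first action and is rerun from the next measured state, which is the receding-horizon pattern the abstract's MPC formulation implies.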
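Finally, a sketch of how these pieces might be wired together for the overall algorithm in contribution (4). It assumes the `ensemble_predict` and `tube_violation` helpers sketched above are in scope, propagates only the ensemble-mean trajectory for brevity, and uses `cost_fn` as a placeholder stage cost; none of this is the thesis's actual architecture.

```python
import numpy as np
import torch

def make_rollout_fn(members, init_state, cost_fn, constraint_fn, error_bounds):
    """Glue for cem_plan: roll an action sequence through the ensemble-mean
    dynamics, accumulate cost, and report the worst tube-based violation."""
    def rollout_fn(action_seq):
        with torch.no_grad():
            state = torch.as_tensor(init_state, dtype=torch.float32)
            total_cost, states = 0.0, []
            for a in action_seq:
                a_t = torch.as_tensor(a, dtype=torch.float32)
                mean, _, _ = ensemble_predict(members, state[None], a_t[None])
                state = mean[0]
                states.append(state.numpy())
                total_cost += float(cost_fn(states[-1], a))
            return total_cost, tube_violation(states, error_bounds, constraint_fn)
    return rollout_fn
```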
Keywords/Search Tags: Reinforcement Learning, Safety Constraints, Model Predictive Control, Uncertainty, Ensembles of Probabilistic Neural Networks