Research And Application Of Dependable Reinforcement Learning Based On Timed Differential Dynamic Logic

Posted on: 2022-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: R H Wang
GTID: 2518306722471874
Subject: Master of Engineering
Abstract/Summary:
Thanks to the development of deep neural networks in recent years, reinforcement learning, and especially deep reinforcement learning, which combines it with deep neural networks, has attracted much attention for its high efficiency and has been widely used in various fields. However, these methods are difficult to use on their own in safety-critical systems because their behavior is hard to predict. This holds especially for deep reinforcement learning, where the black-box nature of the network makes it impossible to guarantee the safety of the system. On the other hand, most safety-critical systems are hybrid systems, which makes the manual design and development of certain core parts very complicated. If reinforcement learning methods can be introduced, the efficiency of such projects can be greatly improved. Dependable reinforcement learning is an approach that both guarantees the safety of the system and makes full use of the efficiency of reinforcement learning algorithms. How to introduce dependable reinforcement learning into safety-critical systems is a key challenge.

Some methods already exist for introducing reinforcement learning into safety-critical systems, including using constraints and exploring only part of the state space. However, these methods can only reduce risk; they cannot completely guarantee safety. A better and more recent approach is to combine reinforcement learning with formal methods based on rigorous mathematical theory to ensure the safety of the system. However, existing methods of this kind still have shortcomings that reduce efficiency when dealing with complex, ever-changing environments.

The main contributions of this thesis are as follows:
(1) Timed Differential Dynamic Logic is proposed to express the properties of the system. This logic extends Differential Dynamic Logic with respect to time. Compared with the original logic, Timed Differential Dynamic Logic is more flexible, especially for solving problems in communication-based systems.
(2) A dependable learning framework based on Timed Differential Dynamic Logic is proposed. This framework can transform a system description written in Timed Differential Dynamic Logic into a runtime monitor, and it can be used together with the Dependable Mixed Control algorithm proposed in this thesis to ensure both the safety and the efficiency of the system.
(3) A Communication-Based Autonomous Control system model is designed, and on this basis an experiment is designed to verify the feasibility and effectiveness of the proposed dependable learning framework.

The significance of this thesis is that it proposes a new logic for describing hybrid systems. In addition, it uses the proposed algorithm together with some existing tools to establish a dependable reinforcement learning framework that can be applied to safety-critical hybrid systems. The efficiency of this framework surpasses that of the state-of-the-art algorithm.
Keywords/Search Tags: Reinforcement Learning, Timed Differential Dynamic Logic, Hybrid System, Runtime Monitor, Safe Control