
The Study And Application Of Distributional Reinforcement Learning Based Reliable Decision Making Methods

Posted on: 2024-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: W D Sheng
Full Text: PDF
GTID: 2530307079459034
Subject: Control Science and Engineering
Abstract/Summary:
Reliable decision making is a key factor in ensuring the stable operation of automated systems. This thesis studies two problems in this field: the reliable shortest path (RSP) problem and the multi-robot reliable search (MuRRS) problem. The RSP problem is path planning in a stochastic road network that considers not only travel time but also reliability (for example, on-time arrival probability), which better meets user needs under complex road conditions. Multi-robot search is a basic building block of applications such as robot search and rescue and emergency response. In the multi-robot search problem, a group of robots searches an enclosed environment for a randomly moving target while minimizing the expected search time; this is also called the multi-robot efficient search (MuRES) problem. In real applications such as search and rescue, however, it is more important to increase the probability of finding the target within a limited time. The thesis therefore adds reliability to the MuRES problem and proposes the MuRRS problem.

To solve these two problems, the thesis builds on distributional reinforcement learning (DRL), which fits the complete probability distribution of travel time or target search time and extracts from it the stochastic information required by a given definition of reliability. Two DRL-based solutions are proposed:

(1) RSP problem: The thesis proposes a DRL method that combines graph embedding and deep learning (GE-DDRL). GE-DDRL reformulates the RSP problem as a Markov decision process (MDP), compresses the state space, and approximates the travel time probability distribution to support reliable decisions. The thesis also proposes LET warm-start, pruning, and an adaptive double ε-greedy strategy to accelerate GE-DDRL training. After training, GE-DDRL can solve the RSP problem in large-scale stochastic road networks.

(2) MuRRS problem: The thesis formulates the problem as a decentralized partially observable Markov decision process (Dec-POMDP). It uses a gated recurrent unit (GRU) network combined with other network layers to approximate each robot's individual probability distribution of target finding time, and combines this with multi-agent reinforcement learning (MARL) methods to obtain a solution named PD-FAC. It then proposes the universal individual global max theory, which guarantees that PD-FAC can complete the MuRRS task under the centralized training with decentralized execution (CTDE) MARL framework. PD-FAC handles MuRRS problems under different reliability definitions, and each robot makes its decisions independently, which makes the method better suited to real application scenarios.

Finally, the thesis builds open-source experimental platforms for both RSP and MuRRS, evaluates GE-DDRL and PD-FAC on them, and compares both with advanced methods in their respective fields. The results show that GE-DDRL and PD-FAC perform well across the tested scenarios.
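The difference between minimizing expected time and maximizing reliability can be illustrated with a small sketch. It is not the thesis's GE-DDRL or PD-FAC implementation: the route names, the travel time samples, and the 30-minute deadline are all made-up assumptions, and the samples simply stand in for whatever distribution a distributional learner would output. The point is only that once a full distribution is available, a reliability criterion such as on-time arrival probability can select a different option than the mean does.

```python
def expected_time(samples):
    """Mean of the sampled travel times."""
    return sum(samples) / len(samples)


def on_time_probability(samples, deadline):
    """Empirical probability that the travel time does not exceed the deadline."""
    return sum(t <= deadline for t in samples) / len(samples)


def pick_by_expected_time(routes):
    """Choose the route with the smallest mean travel time."""
    return min(routes, key=lambda name: expected_time(routes[name]))


def pick_by_reliability(routes, deadline):
    """Choose the route with the highest on-time arrival probability."""
    return max(routes, key=lambda name: on_time_probability(routes[name], deadline))


# Hypothetical travel time samples in minutes (illustrative values only).
routes = {
    "fast_but_risky": [18, 19, 20, 21, 62],  # usually quick, occasionally jammed
    "steady": [27, 28, 29, 29, 30],          # slower on average, never late
}

deadline = 30  # assumed deadline in minutes

print(pick_by_expected_time(routes))          # route with the lowest mean
print(pick_by_reliability(routes, deadline))  # route most likely to arrive on time
```

With these assumed samples the two criteria disagree: the mean favors the route that is usually quick, while the on-time criterion favors the route that never exceeds the deadline. The same reading applies to the search setting, with the probability of finding the target within a time limit in place of on-time arrival.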
Keywords/Search Tags: reliable decision making, reliable shortest path planning, multi-robot reliable search, distributional reinforcement learning, multi-agent reinforcement learning