Font Size: a A A

Reliable Shortest Path Planning With Time Limits

Posted on:2024-04-23Degree:MasterType:Thesis
Country:ChinaCandidate:Z HeFull Text:PDF
GTID:2542307079959039Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
The problem of reliable shortest path(RSP)in stochastic transportation networks is a hot research topic at present,which is devoted to solving the actual demand of travelers with risk aversion consciousness.This thesis mainly studies SOTA problem which introduces time limits into RSP problem.Two algorithms are proposed from two perspectives of concentration inequality and reinforcement learning.In this thesis,a navigating algorithm based on fourth order moments(FMA)is proposed.The maximum likelihood estimation is used to estimate the first four order moments of the travel time of a given policy,and the tight lower bound of the given policy is calculated by the concentration inequality.Finally,the strategy is improved step by step by the generalized strategy iteration(GPI)to achieve the optimal one.Compared with the classical SOTA algorithms,this algorithm requires less input and does not need a complete travel time distribution.High computational efficiency avoids complex convolution calculation caused by iteration of travel time distribution.The algorithm complexity analysis shows that the calculation amount of FMA is small and is positively correlated with the scale of transportation networks.Experimental results in a series of traffic networks show that the performance of FMA is better than that of the existing technologies.On the other hand,a sample efficient generalized actor critic algorithm(SEGAC)is proposed from the perspective of reinforcement learning.Generalization strategy gradient(GPG)and extended value function estimation(E-VFA)in SEGAC can deal with the nonadditive property of SOTA problem.Then a weighted importance sampling and control variable module is proposed to reduce variance of SEGAC.Finally,a node embedding module can enable SEGAC to generalize solutions to new destinations.SEGAC has the characteristics of high sampling efficiency,small variance and strong generalization,and a pre-trained SEGAC network has efficient online decision-making ability.Experiments on a series of traffic networks show that the performance of SEGAC algorithm is better than the existing technologies and the classical AC algorithm.Generalization evaluation experiments show that SEGAC can output standard performance on new destinations.
Keywords/Search Tags:Reliable Shortest Path, Stochastic On Time Arrival, Concentration Inequality, Reinforcement Learning, Actor Critic
PDF Full Text Request
Related items