Reliable Shortest Path Planning With Time Limits

Posted on:2024-04-23

Degree:Master

Type:Thesis

Country:China

Candidate:Z He

Full Text:PDF

GTID:2542307079959039

Subject:Control Science and Engineering

Abstract/Summary:

PDF Full Text Request

The problem of reliable shortest path(RSP)in stochastic transportation networks is a hot research topic at present,which is devoted to solving the actual demand of travelers with risk aversion consciousness.This thesis mainly studies SOTA problem which introduces time limits into RSP problem.Two algorithms are proposed from two perspectives of concentration inequality and reinforcement learning.In this thesis,a navigating algorithm based on fourth order moments(FMA)is proposed.The maximum likelihood estimation is used to estimate the first four order moments of the travel time of a given policy,and the tight lower bound of the given policy is calculated by the concentration inequality.Finally,the strategy is improved step by step by the generalized strategy iteration(GPI)to achieve the optimal one.Compared with the classical SOTA algorithms,this algorithm requires less input and does not need a complete travel time distribution.High computational efficiency avoids complex convolution calculation caused by iteration of travel time distribution.The algorithm complexity analysis shows that the calculation amount of FMA is small and is positively correlated with the scale of transportation networks.Experimental results in a series of traffic networks show that the performance of FMA is better than that of the existing technologies.On the other hand,a sample efficient generalized actor critic algorithm(SEGAC)is proposed from the perspective of reinforcement learning.Generalization strategy gradient(GPG)and extended value function estimation(E-VFA)in SEGAC can deal with the nonadditive property of SOTA problem.Then a weighted importance sampling and control variable module is proposed to reduce variance of SEGAC.Finally,a node embedding module can enable SEGAC to generalize solutions to new destinations.SEGAC has the characteristics of high sampling efficiency,small variance and strong generalization,and a pre-trained SEGAC network has efficient online decision-making ability.Experiments on a series of traffic networks show that the performance of SEGAC algorithm is better than the existing technologies and the classical AC algorithm.Generalization evaluation experiments show that SEGAC can output standard performance on new destinations.

Keywords/Search Tags:

Reliable Shortest Path, Stochastic On Time Arrival, Concentration Inequality, Reinforcement Learning, Actor Critic

PDF Full Text Request

Related items

1	Study On Adaptive Pid Control Strategy Based On Actor-critic Learning
2	Research And Implementation Of Actor-Critic Algorithm Model For Aircraft Autonomous Landing
3	Actor-Critic Reinforcement Learning And Applications To Automatic Ship Berthing
4	Research On Bidding Strategy Of Generators In Electricity Market Based On Asynchronous Advantage Actor-Critic Reinforcement Learning
5	Research On Improved USV Option Based Path Planning Method
6	Reliable Shortest Path Research Based On Big Data
7	Research On Local Path Planning In Vehicle-Road Cooperation
8	Asynchronous Generalized Advantage Actor-critic And Application In Automatic Driving
9	Research On Electric Vehicle Routing Problem Based On Reinforcement Learning
10	Research On Energy Management Strategy Of Fuel Cell Hybrid Vehicle Based On Improved Soft Actor-Critic Algorithm