
Dynamic Pricing And Seat Allocation For High-speed Railway Based On Deep Reinforcement Learning

Posted on: 2023-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Zou
Full Text: PDF
GTID: 2542307073492204
Subject: Transportation engineering
Abstract/Summary:
With the development of China's society and economy, the mileage of high-speed railway has kept increasing, driven by its speed, safety and comfort. However, alongside this rapid development, China National Railway Group Limited faces debt that rises year by year. Ticket prices and seat allocation for high-speed trains are currently relatively fixed, so high-speed rail cannot yet compete effectively with other modes of transportation or maximize its revenue. Building on existing research results, this paper studies dynamic pricing and seat allocation methods for high-speed railway, using deep reinforcement learning to control ticket prices and seat allocation. The main work of this paper is as follows.

Firstly, passengers choose among various transportation services. This paper constructs passenger utility in terms of travel time and fare, and describes passenger choice behavior through a multinomial logit discrete choice model based on utility maximization theory. In the actual ticketing process, passenger demand is stochastic, so this paper describes the ticketing process under stochastic demand as a Markov decision process and establishes a model to maximize high-speed railway ticketing revenue.

Next, a high-speed railway ticketing environment suitable for deep reinforcement learning is set up. The pricing and seat allocation constraints are embedded in the environment through two components, state transition and immediate reward, which also reflect the revenue maximization objective. Because traditional dynamic programming algorithms face the curse of dimensionality when solving for a policy that copes with stochastic passenger demand, this paper proposes to solve the problem by deep reinforcement learning. Two deep neural networks, an actor and a critic, are constructed and trained with the Proximal Policy Optimization reinforcement learning algorithm.

Finally, an environment model is constructed with data from the Chengdu-Chongqing high-speed railway to demonstrate the performance of the actor as training advances under two environments with different degrees of randomness. The trained actor is able to make ticket price and seat allocation decisions based on the actual ticketing state. To demonstrate its advantage under random passenger demand, the trained actor is compared with a static set of decisions. In addition, there may be inconsistencies between the real environment and the training environment, so the adaptability of the proposed method is demonstrated by testing the trained actor in environments with a variety of degrees of randomness.
Keywords/Search Tags:Dynamic pricing, Seat allocation, Discrete choice model, Markov decision process, Deep reinforcement learning
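
The abstract describes passenger choice with a multinomial logit model over travel time and fare, and a ticketing environment built from a state transition and an immediate reward. The following is a minimal Python sketch of those two ideas, not the thesis's actual model: the utility coefficients, the competing mode's time and fare, the uniform arrival process, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for a list of utilities."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def utility(time_hours, fare, beta_time=-1.0, beta_fare=-0.01):
    """Linear utility in travel time and fare (coefficients are assumed)."""
    return beta_time * time_hours + beta_fare * fare


def hsr_choice_probability(hsr_fare, hsr_time=1.5, alternatives=((4.5, 100.0),)):
    """Probability that a passenger picks high-speed rail over the
    alternatives, each given as an assumed (time_hours, fare) pair."""
    utils = [utility(hsr_time, hsr_fare)]
    utils += [utility(t, f) for t, f in alternatives]
    return mnl_probabilities(utils)[0]


def ticketing_step(seats_left, hsr_fare, rng, max_arrivals=10):
    """One toy state transition: random arrivals, logit choice, and a
    seat-capacity constraint. Returns (new_seats_left, immediate_reward)."""
    arrivals = rng.randint(0, max_arrivals)  # assumed stochastic demand
    p = hsr_choice_probability(hsr_fare)
    wants_hsr = sum(1 for _ in range(arrivals) if rng.random() < p)
    sold = min(wants_hsr, seats_left)  # cannot sell more than remaining seats
    return seats_left - sold, hsr_fare * sold
```

In a setup like the one the abstract outlines, an actor network would choose `hsr_fare` (and seat allocations) at each step and a Proximal Policy Optimization update would use the returned reward; the sketch covers only the environment side, with a single origin-destination pair.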