In recent years, smart travel services such as ShenZhou and Didi have become increasingly popular. Unlike traditional taxi services, they adopt dynamic pricing mechanisms to regulate supply and demand on the road, which improves service capacity and quality. From the driver’s point of view, however, dynamic pricing raises a new problem: how can drivers best search for passengers under a dynamic pricing mechanism? Seeking-route recommendation has been widely studied for taxi services, with approaches including machine-learning-based hot-spot recommendation, deep-learning-based road recommendation, and reinforcement-learning-based recommendation aimed at long-term benefits. The traditional taxi industry, however, has its own limitations: research results could reach drivers only as general advice through radio, newspapers, and journals, and could not be put to practical use. The rise of smart travel services has largely overcome this shortcoming. In these services, the dynamic price is a new and accurate indicator of supply and demand conditions, yet it has rarely been studied as a clue to help drivers seek passengers. In this thesis, we propose to incorporate dynamic prices as a key factor in recommending seeking routes to drivers. First, we show the importance of and need for doing so by analyzing real service data. Second, based on GPS trajectory data sets of passengers and vehicles, we design a Markov Decision Process (MDP) model. A simple probabilistic method simulates the fluctuation of dynamic prices in the real environment, and the dynamic pricing mechanism is introduced into the reward design. The MDP can be solved by dynamic programming, but this is costly in time; to address this problem, we design a Q-learning model to solve the MDP. The experimental results show that: (1) When seeking passengers in the city center, the model dispatches drivers reasonably, preventing them all from gathering in a specific area with a high price multiplier and causing oversupply there. (2) When seeking passengers in the suburbs, the model helps drivers search for passengers in the nearest area, which effectively eases the difficulty of getting a ride in evening-peak areas. Because dynamic pricing is introduced, when areas with low and high price multipliers have the same pickup probability, our model gives priority to sending drivers to the areas with high price multipliers. (3) In terms of driver income, the introduction of dynamic pricing leads different drivers to earn different incomes, which better simulates how real drivers search for passengers and will play a major role when multi-agent reinforcement learning is introduced in the future to address competition among drivers. Compared with drivers using the traditional MDP model, drivers using the MDP model with dynamic pricing achieve higher average benefits, with an improvement of up to 6%.
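The idea of putting the price multiplier into the reward and solving the MDP with Q-learning can be illustrated with a minimal sketch. This is not the model from the thesis: the one-dimensional row of zones, the three moves, the function names `train_seeking_policy` and `greedy_move`, and all numeric values (base fare, seeking cost, learning rate, discount factor, exploration rate) are assumptions chosen only for illustration.

```python
import random

MOVES = (-1, 0, 1)  # hypothetical actions: move to left zone, stay, move to right zone


def train_seeking_policy(pickup_prob, multiplier, base_fare=10.0, seek_cost=1.0,
                         episodes=5000, max_steps=20, alpha=0.1, gamma=0.9,
                         epsilon=0.2, seed=0):
    """Tabular Q-learning over a toy row of zones (illustrative assumption).

    pickup_prob[z] is the chance of finding a passenger after moving to zone z;
    multiplier[z] is the dynamic price multiplier applied to the fare there.
    """
    rng = random.Random(seed)
    n = len(pickup_prob)
    q = [[0.0] * len(MOVES) for _ in range(n)]
    for _ in range(episodes):
        zone = rng.randrange(n)  # each episode starts from a random zone
        for _ in range(max_steps):
            # Epsilon-greedy choice between exploring and exploiting.
            if rng.random() < epsilon:
                a = rng.randrange(len(MOVES))
            else:
                a = max(range(len(MOVES)), key=lambda i: q[zone][i])
            nxt = min(max(zone + MOVES[a], 0), n - 1)  # clamp at the borders
            picked_up = rng.random() < pickup_prob[nxt]
            if picked_up:
                # Reward includes the price multiplier; the episode ends here.
                target = base_fare * multiplier[nxt]
            else:
                # No passenger: pay a seeking cost and keep searching.
                target = -seek_cost + gamma * max(q[nxt])
            q[zone][a] += alpha * (target - q[zone][a])
            if picked_up:
                break
            zone = nxt
    return q


def greedy_move(q, zone):
    """Return the move with the highest learned value in the given zone."""
    best = max(range(len(MOVES)), key=lambda i: q[zone][i])
    return MOVES[best]


if __name__ == "__main__":
    # Equal pickup probabilities, different price multipliers (made-up values).
    q_table = train_seeking_policy([0.5, 0.5, 0.5], [1.0, 1.2, 2.0])
    print([greedy_move(q_table, z) for z in range(3)])
```

In this toy setting all zones share the same pickup probability, so the learned policy should steer a driver toward the zone with the higher multiplier, mirroring the behavior described in result (2). A realistic model would derive zones, pickup probabilities, and price fluctuations from the trajectory data rather than from fixed constants.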