| As urban modernization accelerates, prioritizing the development of public transport and building intelligent public transportation systems have become important ways to address urban traffic problems and environmental pollution. Bus arrival time prediction is the most important measure for realizing the informatization of public transportation. Arrival time is the information travelers care about most, and they can use it to plan their trips. It also helps transit agencies dispatch buses effectively. Intuitively, many stochastic factors affect the predictability of arrival time, such as traffic conditions, weather, and local events. Moreover, bus arrival time prediction is essentially a multi-step-ahead prediction task. Motivated by these two observations, this paper investigates a method that exploits long-range dependencies in heterogeneous measurements and proposes using a Recurrent Neural Network (RNN) to model multi-step-ahead arrival times. First, since no widely used dataset exists, we build a bus trip dataset from GPS data. The filtered GPS data, together with static information, are mapped into pairs of arrival time and travel distance. To our knowledge, the dataset released here is the largest of its kind and is unbiased, which makes the results easy to reproduce. Second, we investigate heterogeneous data, including static data (statistics of the infrastructure) and dynamic measurements (historical trajectory data), and explore the direct and indirect correlations between bus arrival time and these factors. One-hot encoding is introduced to distinguish different locations and to fuse these heterogeneous measurements into a vector space; the resulting vectors are then organized into a bus trip sequence. Unlike statistical and regression methods that predict from a single data source, this paper adopts an RNN with Long Short-Term Memory (LSTM) to capture the long-range traffic dependencies hidden in the bus trip
sequence, namely the correlation between changing traffic conditions and bus arrival times. As the bus runs, the LSTM network unfolds and can correct its predictions multiple steps ahead using newly generated inputs. The dynamic adaptability of the LSTM RNN gives it good performance compared with state-of-the-art methods. Finally, we introduce taxi origin-destination (OD) data into the task of bus arrival time prediction. Human travel is the key cause of road traffic variation. By mining the taxi OD data, we find evidence, in the form of trip intensity, of regular changes in the travel patterns of urban residents. Used as prior knowledge, trip intensity can improve the predictive performance of our proposed LSTM-based algorithm. Incorporating spatio-temporal relationships into our prediction approach is an interesting and important direction for future work. |
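
The fusion step mentioned in the abstract (one-hot encoding of locations, concatenated with static and dynamic measurements and ordered into a bus trip sequence) can be illustrated with a minimal sketch. The helper names, the choice of features (segment length, recent travel time), and the toy values are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def one_hot(index: int, size: int) -> List[float]:
    """Return a vector of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_step(location_id: int, num_locations: int,
                static_feats: Sequence[float],
                dynamic_feats: Sequence[float]) -> List[float]:
    """Fuse location identity with static and dynamic measurements into one vector."""
    return one_hot(location_id, num_locations) + list(static_feats) + list(dynamic_feats)


def build_trip_sequence(
        steps: Sequence[Tuple[int, Sequence[float], Sequence[float]]],
        num_locations: int) -> List[List[float]]:
    """Encode each step of a trip, preserving travel order."""
    return [encode_step(loc, num_locations, s, d) for loc, s, d in steps]


# Hypothetical trip over 3 locations:
# (location id, [segment length in km], [recent travel time in s])
trip = [(0, [0.4], [65.0]), (1, [0.7], [120.0]), (2, [0.5], [80.0])]
sequence = build_trip_sequence(trip, num_locations=3)
```

Each element of `sequence` is a fixed-length vector, so the list could serve as the per-step input to a sequence model such as an LSTM. How the paper actually scales or orders these features is not stated in the abstract.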