| Modern urban public transportation has become the main travel mode for urban residents. An in-depth understanding of the travel patterns of millions of public transportation passengers, and of the spatial and temporal characteristics of those patterns, is an important cornerstone of urban road network planning, emergency management, operation scheduling, and public services. At present, most research on urban commuting focuses on either the rail transit system or the bus system alone, and analysis of a single data source cannot objectively reflect the full spatiotemporal characteristics of urban commuting. This study uses two data sources (bus and subway ticket data) to conduct a full-link commuting analysis of Shenzhen’s public transportation system. First, to address the quality problems in the data, corresponding methods for improving data quality and data semantics are proposed. Second, passenger bus card data are correlated with bus GPS track information through clock calibration to infer each passenger’s boarding station; the travel link assumption and a probability model are then used to infer the alighting station. Combined with subway origin-destination (OD) information, passengers’ transfer behavior and travel modes on public transportation are determined. Then, the spatiotemporal characteristics of commuting, the distribution of jobs and residences, and the spatiotemporal characteristics of passenger flow under different travel modes are studied. Finally, based on the spatiotemporal characteristics of inbound and outbound passenger flow at subway stations, the features that influence station passenger flow are analyzed and extracted. In particular, for outbound passenger flow prediction, a feature extraction method based on spatiotemporal causality is proposed. A neural network model based on Long Short-Term Memory (LSTM) was built to predict the passenger flow of rail
transit in and out at four time granularities (5 min, 10 min, 15 min, and 30 min), and its predictive performance was compared with that of the ARIMA, BP, and SVM models. The experiments were run with Spark on a 16-node cluster, using a very large-scale real traffic dataset. The experimental results show the following. (1) The average daily numbers of bus-only, subway-only, and combined-mode trips each reach the order of one million, so a single data source cannot reflect the overall spatiotemporal commuting characteristics of urban residents. Specifically, 90% of bus passengers and subway passengers commute within 38 minutes and 50 minutes, respectively, over distances within 12 km and 18 km, respectively. For 90% of passengers traveling in combined mode, the commute time is about 95 minutes and the commute distance is about 20 km. In addition, the average commute time and distance are shorter in the central area and longer in the outer area. (2) The distribution of residents’ jobs and residences is identified from passengers’ commuting and travel characteristics. Urban residents work mainly in Nanshan, Futian, Luohu, and other areas, while their residences mainly extend along the rail transit network into the Bao’an, Longhua, and Longgang areas. (3) On weekdays, the time-of-day passenger flow at subway stations shows pronounced morning and evening peaks, whereas on weekends it is relatively flat; over longer time spans, it fluctuates with a one-week cycle. Stations with large passenger flow are concentrated mainly in the central business district and at popular transfer stations; station OD pairs with large passenger flow are concentrated between residential and employment areas and between business districts and popular transfer stations. (4) For inbound passenger flow samples based on historical time series features, the LSTM model has the best
prediction performance, and the 15-minute inbound passenger flow accuracy reaches 84%; for outbound passenger flow samples based on historical time series features, the LSTM model outperforms the ARIMA model; for outbound passenger flow samples based on causal spatiotemporal features, the LSTM model outperforms the BP and SVM models and also outperforms the LSTM model that uses outbound passenger flow samples based on historical time series features. The prediction accuracy for 30-minute outbound passenger flow reaches 92%. |
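
The four time granularities used for prediction (5, 10, 15, and 30 minutes) imply that raw ticket records are first aggregated into fixed-width time bins per station. The following is a minimal sketch of that aggregation step only, not the study's Spark pipeline: the function name `bin_passenger_flow` and the sample timestamps are hypothetical, and bins are assumed to be aligned to midnight.

```python
from collections import Counter
from datetime import datetime


def bin_passenger_flow(tap_times, granularity_min):
    """Count tap events per fixed-width time bin.

    Assumption: bins are aligned to midnight and identified by their
    start time; `granularity_min` divides 60 or is a multiple of it.
    """
    counts = Counter()
    for t in tap_times:
        minutes = t.hour * 60 + t.minute
        # Round down to the start of the bin containing this tap.
        start = minutes - minutes % granularity_min
        bin_start = t.replace(
            hour=start // 60, minute=start % 60, second=0, microsecond=0
        )
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical tap-out timestamps at one station (illustrative data only).
taps = [
    datetime(2020, 1, 6, 8, 1, 10),
    datetime(2020, 1, 6, 8, 4, 59),
    datetime(2020, 1, 6, 8, 7, 30),
    datetime(2020, 1, 6, 8, 16, 0),
    datetime(2020, 1, 6, 8, 29, 45),
]

# One flow series per granularity considered in the abstract.
flows = {g: bin_passenger_flow(taps, g) for g in (5, 10, 15, 30)}

for g, series in flows.items():
    print(g, {k.strftime("%H:%M"): v for k, v in series.items()})
```

Coarser bins smooth the series because each bin pools more taps, which may be one reason accuracy figures differ across granularities. At the scale described, the same grouping would more likely be expressed as a distributed group-by than as a Python loop.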