| As travel demand grows and road operating conditions become more volatile, short-term demand forecasting for conventional buses has become a research hotspot. Reliable short-term passenger flow prediction helps operators adjust the schedules of conventional bus lines and serve diversified travel needs, thereby improving passenger satisfaction and economic benefits. Data volatility directly affects prediction accuracy. Bus passenger flow data is derived mainly from card swipe data, which is in turn directly affected by bus arrivals. Therefore, to reduce the adverse impact of data volatility and lower the prediction error, this paper takes the number of vehicle arrivals, the time period, the passenger flow in the adjacent preceding periods, and the working-day/non-working-day indicator as feature inputs and constructs a short-term bus passenger flow prediction model based on XGBoost. The main work is as follows: (1) Calculation of passenger flow data. Because card swipe records are tied to vehicle arrivals, they do not reflect when passengers actually arrive at the station. To approximate the real passenger flow as closely as possible, this paper assumes that passenger arrivals are uniformly distributed and spreads the number of passengers recorded at a station evenly over the interval from the previous vehicle's departure to the current vehicle's arrival. Compared with the time distribution curve of the boarding counts at the station, the calculated passenger flow curve is visibly more stable. (2) Analysis of passenger flow fluctuation and its influencing factors. First, the volatility of short-term passenger flow is analyzed; next, through correlation analysis between the number of vehicles arriving on multiple routes and the passenger flow of the selected route, the influencing factors of passenger flow
fluctuation were explored, and the number of vehicle arrivals was identified as the main factor affecting passenger flow fluctuation; finally, the time distribution of short-term passenger flow is studied, showing that passenger flow is relatively strongly correlated with the passenger flow of the four preceding periods (the longitudinal direction), with correlation coefficients above 0.4, and that the distribution differs significantly and changes over time. In summary, this paper takes the number of vehicle arrivals, the time period, the passenger flow of the adjacent preceding periods, and the working-day/non-working-day indicator as the feature inputs of the short-term passenger flow prediction model. (3) Construction of the short-term bus passenger flow prediction model. In the model input data set, the vehicle-arrival variables are numerous, and the time-period variables are both numerous and sparse. Because XGBoost handles high-dimensional, sparse data effectively and thereby improves algorithm efficiency, this paper establishes an XGBoost model for short-term passenger flow prediction. In building the model, feature engineering is first carried out to convert the selected features into an input data set suitable for model learning; the time-period feature is a categorical variable with many levels and therefore requires one-hot encoding. The model is then tuned: grid search with cross-validation is used to select optimal values for the general parameters, booster parameters, and learning task parameters of the XGBoost model. (4) Case study. Data from the first 27 days of a month for 30 conventional bus lines in Guangzhou, covering single-peak, double-peak, and other line types, are used to build the XGBoost prediction model, and data from the following 3 days serve as the test set to verify the model results. First, the model validity analysis is
carried out. Compared with the KNN regression, BP neural network, and LSTM models, the XGBoost model constructed in this paper lowers the mean absolute percentage error (MAPE) from 61.08%, 50.85%, and 44.11%, respectively, to 42.9%, and shortens the computation time from 11.78 s, 40.21 s, and 28.15 s, respectively, to 10.63 s, showing that the XGBoost model predicts the short-term passenger flow of conventional buses faster and more accurately. In addition, when the number of vehicle arrivals is not used as a feature input, the model's MAPE worsens to 55.86%, indicating that including the number of vehicle arrivals as a feature input effectively improves prediction accuracy. Finally, a sensitivity analysis of the model is performed. The MAPE of the XGBoost prediction model is 30.26% during peak hours, better than the 43.9% during off-peak hours. The XGBoost prediction model constructed in this paper therefore predicts comparatively well during peak hours. |
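
The passenger flow calculation in step (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `spread_boardings`, the encoding of times as minutes since midnight, the 15-minute period length, and the sample records are all assumptions made for illustration, since the abstract does not state them. The sketch only shows the stated idea of spreading each vehicle's recorded passengers uniformly over the interval from the previous departure to the current arrival and summing the shares per time period.

```python
from collections import defaultdict


def spread_boardings(events, period_minutes=15):
    """Allocate each vehicle's passenger count uniformly over its waiting interval.

    events: iterable of (prev_departure, arrival, boardings); times are minutes
            since midnight (an assumed encoding) with prev_departure < arrival.
    Returns {period_index: estimated passenger arrivals in that period}.
    """
    flow = defaultdict(float)
    for prev_departure, arrival, boardings in events:
        if arrival <= prev_departure:
            raise ValueError("arrival must be later than the previous departure")
        rate = boardings / (arrival - prev_departure)  # passengers per minute
        t = prev_departure
        while t < arrival:
            period = int(t // period_minutes)
            period_end = (period + 1) * period_minutes
            # Stop at whichever comes first: end of the period or vehicle arrival.
            segment_end = min(arrival, period_end)
            flow[period] += rate * (segment_end - t)
            t = segment_end
    return dict(flow)


# Hypothetical records: 20 passengers counted for a vehicle arriving at 07:20
# (previous departure 07:00), then 30 for a vehicle arriving at 07:30.
events = [(420, 440, 20), (440, 450, 30)]
flow = spread_boardings(events)
```

Because every record is split in proportion to elapsed time, the per-period values always sum to the total recorded count, so the smoothing changes only when passengers are assumed to have arrived, not how many.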