| The urban public transportation system is an organic whole composed of multiple public transportation modes, of which urban rail transit and conventional buses are both important components. The impact of a newly built subway line on bus passenger flow is complex: the distance between bus stops and subway stations, the positional relationship between bus lines and subway lines, and the nature of land use around subway stations all lead to different changes in surrounding bus passenger flow after the subway opens. If the passenger flow of surrounding buses can be reasonably predicted before the subway opens, bus schedules can be adjusted so that public transportation resources are used effectively. Based on the relationship between bus stops and subway stations, this paper uses machine learning and statistical analysis methods to study the impact of a subway opening on bus passenger flow, and establishes a model of passenger flow at surrounding bus stops after the subway opens. The model is verified with a case study of the changes in surrounding bus passenger flow before and after the opening of Beijing Metro Line 6 and the southern section of Line 8. The main research work is divided into the following three parts. (1) A machine-learning-based model called BPMFS (Prediction of Bus passenger Flow under new Subway line based on Machine learning) is established. First, features that may affect bus passenger flow are selected and constructed. Then MIC is used to evaluate the importance of the features, and the features are screened according to the evaluation results. Finally, an ensemble regression tree model (XGBoost) is established for prediction, with its hyperparameters tuned by Bayesian optimization. (2) To address the difficulty of identifying the passenger flow patterns of bus stops in the BPMFS model, a clustering model for bus stop passenger flow sequences (W-shape) is proposed. Based on the general properties of bus passenger flow time series, the algorithm designs RSBD (Regular and Shape Based Distance) as the measure of correlation between time series, and designs a new cluster center extraction method, SBC, that jointly considers the shape and amplitude of the time series. Calculations show that W-shape is an effective clustering model that can identify and cluster bus passenger flow sequences efficiently. (3) In the empirical study, the data sources, data cleaning methods, and data status are explained in detail, features are constructed from the actual data, and the BPMFS model is applied to the case. With the extension of Beijing Metro Line 6 as the training set and the southern section of Line 8 as the verification set, the verification set yields a MAPE of 0.059, an RMSE of 4.607, and an R2 score of 0.938, which indicates that the model can be effectively applied in practice. |
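
The three verification metrics reported above (MAPE, RMSE, and the R2 score) can be illustrated with a minimal sketch. It assumes the standard textbook definitions and treats MAPE as a fraction (so 0.059 would mean 5.9%), which the abstract does not state explicitly. The passenger counts below are hypothetical and are not taken from the paper's data.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction.
    Undefined when any actual value is zero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the inputs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical passenger counts at three bus stops (illustration only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"MAPE = {mape(actual, predicted):.3f}")
print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"R2   = {r2_score(actual, predicted):.3f}")
```

Read together, a low MAPE and RMSE with an R2 score close to 1 mean that predictions are close to the observed flows both in relative and in absolute terms, which is the sense in which the reported values support the model's practical use.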