Research On Train Delay Prediction Method Based On Feature Selection And Machine Learning

Posted on:2021-04-29

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Ji

Full Text:PDF

GTID:2392330614971443

Subject:Control engineering

Abstract/Summary:

As train speeds in China continue to increase, passengers expect higher punctuality. In actual operation, however, disturbances cause trains to deviate from the planned train diagram. At present, dispatchers estimate train delay times from their own dispatching experience and adjust the train diagram online. Predicting train arrival delays in advance would give dispatchers more accurate information and further improve the effect of scheduling optimization. Based on actual delay cases, this thesis studies the patterns of train delays under disturbance and designs a method for predicting train arrival time. The main contents are as follows.

(1) Based on the text descriptions of the train delay cases, the TF-IDF text mining method was used for keyword extraction. The factors affecting train delays were analyzed in combination with actual train operation. The features of train delays were compiled according to field surveys and the current state of research. By collecting train schedules and line information and combining them with the delay cases, the features were converted to numerical form to construct the train delay data set.

(2) The relevance and redundancy of the features were analyzed. To handle weakly correlated and redundant features, an improved feature selection algorithm based on Max-Relevance and Min-Redundancy (mRMR) was proposed. The maximal information coefficient (MIC) replaced the original mutual information as the criterion for evaluating the correlation between variables. An evaluation criterion fusing MIC and the Spearman coefficient was designed to address two shortcomings of the original algorithm: mutual information is insensitive to discrete values, and only a single measurement criterion is used. The effectiveness of the resulting MS-mRMR algorithm on the delay data set was demonstrated by comparing its prediction accuracy with that of the feature set selected by the original mRMR algorithm.

(3) Based on the delay data set, the random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost) algorithms were selected to build regression models for predicting delay time. The coefficient of determination R² was used as the weight to improve the random forest algorithm, and the accuracy of the weighted random forest (wRF) was compared with that of RF on the delay data set. Particle swarm optimization (PSO) and grid search (GS) were used to optimize the model hyperparameters. The optimal algorithm combination was obtained by comparing the accuracy of the three optimized prediction models.

(4) A train delay prediction system based on Django was developed. The system uses the optimal algorithm combination as its back-end prediction engine, implementing the prediction model studied in this thesis.

The thesis contains 35 figures, 26 tables, and 78 references.
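Item (3) states only that R² serves as the weight in the improved random forest; the abstract does not say how the per-tree R² is obtained or how the weights are normalized. The following is a minimal sketch of one plausible reading, not the thesis's implementation. It assumes that each tree is scored by its R² on held-out data, that negative scores are clipped to zero, and that the weights are normalized to sum to one. The function names `r2_score`, `r2_weights`, and `weighted_predict` are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


def r2_weights(y_val, tree_val_preds):
    """One weight per tree from its held-out R^2 (assumed scheme).

    Negative R^2 values are clipped to 0, then weights are normalized
    to sum to 1. If every tree scores <= 0, fall back to uniform
    weights, which reduces to the ordinary random forest average.
    """
    scores = [max(r2_score(y_val, preds), 0.0) for preds in tree_val_preds]
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def weighted_predict(weights, tree_preds):
    """Weighted average of the per-tree predictions for each sample."""
    n_samples = len(tree_preds[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, tree_preds))
        for i in range(n_samples)
    ]


# Toy held-out delays (minutes) and the predictions of three "trees".
y_val = [1.0, 2.0, 3.0, 4.0]
tree_val_preds = [
    [1.0, 2.0, 3.0, 4.0],  # perfect fit
    [1.5, 2.5, 3.5, 4.5],  # constant offset of 0.5
    [4.0, 3.0, 2.0, 1.0],  # reversed order, negative R^2
]
weights = r2_weights(y_val, tree_val_preds)
blended = weighted_predict(weights, tree_val_preds)
```

Under these assumptions, a tree that predicts worse than the mean of the held-out targets receives zero weight, while an unweighted forest would let it contribute equally to the average.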

Keywords/Search Tags:

Delay Prediction, Machine Learning, Feature Selection, Train Delay Factors, XGBoost

Related items

1	Intelligent Prediction Of Train Delay Changes And Propagation In High Speed Railways
2	Study On Train Delay Clustering And Classification Prediction Of Guangzhou-Shenzhen High-speed Railway
3	Research On The Train Delay Analysis And Prediction Method Of High-speed Railway Based On Data-driven
4	Empirical Analysis Of Machine Learning Classification Algorithm To Flight Delay Data
5	High-Speed Train Delay Prediction Method And System Based On Extreme Learning Machine
6	Research On Flight Delays Prediction Methods Based On Machine Learning
7	Study On The Train Delay Prediction Model By Using The Real Operation Data Of The Dutch Railway Network
8	Delay Analysis And Prediction Of Actual Train Performance
9	Research On Train Delay Prediction Based On Delay Classification And PSO-GBDT
10	Research On Delay Prediction And Operation Adjustment Of High-Speed Railway Trains Under Rainfall Conditions