
Study On Train Delay Clustering And Classification Prediction Of Guangzhou-Shenzhen High-speed Railway

Posted on: 2022-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: R Hu
Full Text: PDF
GTID: 2492306740950319
Subject: Traffic and Transportation Engineering
Abstract/Summary:
Building and operating intelligent railways is the next stage of development and a key research direction for China's high-speed railways, and it places higher demands on automated, intelligent processing and analysis of train operation data. Delays of high-speed trains reduce the quality of service that the railway transportation department provides to passengers, so it is necessary to explore the causes of delay and the delay propagation laws of different types of trains. Considering both the trend toward intelligent railways and the urgency of research on high-speed train delays, this thesis conducts a clustering study of delayed trains based on the original operation records of the Guangzhou-Shenzhen high-speed railway. The four resulting types of delayed trains differ in delay time, cause of delay, and delay propagation law. The classification makes the categorization of delayed trains more targeted and supports decisions that require a quick judgment of how a train delay will develop. The main work and conclusions are as follows.

First, the Guangzhou-Shenzhen high-speed railway is introduced, descriptive statistics of the train data are computed, outliers are analyzed and removed, and the distributions of the characteristic parameters of trains on the line are revealed with visualization methods. A Lorenz curve is used to quantify consecutive train delays on the line.

Then, in view of the complex data structure, a correlation analysis of the delay feature parameters is carried out. The delay features are divided into station-level features and section-level features, and both groups are standardized to make them better suited to the algorithms used. The features are then screened, and the ten most important feature parameters are selected for dimensionality reduction on the premise of losing less than 5% of the information in the original data; the high-dimensional train data is finally reduced to three dimensions. The data is then fed into clustering models such as K-means and BIRCH. The silhouette coefficient is first used to set the number of output clusters for each model so that each model clusters well, and the Calinski-Harabasz index (CHI) and the Davies-Bouldin index (DBI) are then introduced to evaluate the output of each model. Finally, after jointly considering the evaluation indices and the number of samples in each delay type, the four types of delayed trains output by BIRCH are selected. Analysis of the four types shows that they differ not only in delay time but also, clearly, in the causes of delay. The delayed trains are divided into severely delayed trains, moderately delayed trains, slightly delayed trains, and delay-recovery trains, and on this basis a standard for classifying train delays on the Guangzhou-Shenzhen high-speed railway is proposed.

Finally, the delay time of the four types of delayed trains is predicted. The delay propagation law of each type is analyzed separately, the delay of each type in subsequent operation is evaluated, and the macroscopic law of delay propagation for different types of trains is obtained. The delay time of each type of train is predicted with the gradient boosting decision tree algorithm. After the model is tuned and optimized, the results show that it has strong generalization ability. A control model is set up for comparison: the prediction error after classification is lower than that of delay-time prediction without classification, with the error reduced by 67%. The model predicts the delay of all four types of delayed trains with good accuracy.
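As an illustration of how the number of clusters can be chosen with the silhouette coefficient (sometimes translated as "contour coefficient"), the following is a minimal sketch and not the thesis's implementation. It assumes synthetic three-dimensional points in place of the reduced train data, and a plain Lloyd's K-means written with the Python standard library in place of the K-means and BIRCH models; the function names `make_demo_points`, `kmeans`, and `silhouette` are hypothetical.

```python
import math
import random


def make_demo_points(seed=0, per_cluster=20):
    """Synthetic 3-D points around four separated centres (not thesis data)."""
    rng = random.Random(seed)
    centres = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (0, 0, 10)]
    points, labels = [], []
    for label, centre in enumerate(centres):
        for _ in range(per_cluster):
            points.append(tuple(c + rng.gauss(0, 0.5) for c in centre))
            labels.append(label)
    return points, labels


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres are k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centres.append(centres[c])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points, in [-1, 1]."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Score several candidate cluster counts; a higher score means tighter,
# better separated clusters.
points, _ = make_demo_points()
for k in range(2, 7):
    print(k, round(silhouette(points, kmeans(points, k)), 3))
```

In practice the cluster count with the highest score would be retained for each model. Because this K-means uses a single random initialization, a library implementation with multiple restarts would likely be preferable on real data.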
Keywords/Search Tags: Guangzhou-Shenzhen high-speed railway, feature engineering, train delay clustering, machine learning, train delay prediction