With the continued development of China's high-speed railway, delays have an increasing impact on the organization of train and station operations. Accurate delay prediction is the basis for intelligent dispatching and for improving the service level of high-speed railway transportation. However, existing research does not describe delay patterns clearly enough under specific delay scenarios, does not sufficiently analyze and identify the factors influencing delay, and leaves room to improve the accuracy and applicability of prediction models. Therefore, from the perspective of the delay propagation process, this paper studies the delay patterns of high-speed railway trains, the identification of factors affecting train delay, and the construction of prediction models under multiple scenarios. The main work is as follows:

(1) Analyze the delay patterns of high-speed railway trains and identify the influencing factors. First, the data are preprocessed to obtain delay data sets for the primary delay scenario and the off-line train delay scenario. On this basis, the primary delay scenario is refined by clustering and the train delay patterns are analyzed in detail, which serves as the basis for selecting the factors affecting delay. Finally, feature engineering oriented to the identification of influencing factors is constructed, which reduces the dimension of the influencing factors by up to 68.57%. This solves the problems of identifying influencing factors and constructing the feature set of the prediction model in the primary delay scenario.

(2) Construct an arrival delay prediction model considering primary delay. First, based on the identified influencing factors, the prediction problem considering primary delay is formulated, and the arrival delay at the next station is set as the prediction target. Second, a delay prediction model based on an Improved Deep Neural Network (IDNN) is constructed to improve the delay prediction efficiency when multidimensional factors are input. Finally, the model is verified in a case study using primary delay data. The results show that the feature engineering improves accuracy by up to 12.62% and reduces training time by up to 54.85%, and that it effectively identifies the factors affecting train delay in the primary delay scenario. The constructed IDNN model outperforms the other baseline models and achieves high-precision prediction of the arrival delay at the next station.

(3) Construct an arrival delay prediction model considering off-line trains. Building on the identification of influencing factors and the next-station delay prediction model, and considering that off-line train delays affect a wider range and therefore require a larger prediction range, the prediction target is set to multi-station arrival delay. The influencing factors of off-line train delay are further extracted according to the operation process of off-line trains in the station. To address the problems that the IDNN model cannot automatically identify influencing factors and that the fitting ability of a single model is limited, a prediction model integrating the Embedded method and Stacking is established. Finally, a case study is carried out with off-line train delay data. The results show that the proposed model outperforms the IDNN model and other baseline models and solves the multi-station arrival delay prediction problem.

(4) Develop a Django-based delay prediction visualization system. The code for data processing, feature engineering, and the prediction models serves as the back-end algorithm. The system is written in Python with a B/S (browser/server) architecture, and the front-end interface is developed with Element-UI and Pyecharts. The delay pattern analysis, influencing factor identification, and delay prediction results of this paper are presented on the Web.
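As an illustration of the Stacking idea mentioned in (3), the following is a minimal sketch only, not the model built in this paper. It assumes two hypothetical input features (departure delay and scheduled buffer time, both in minutes), two univariate linear base learners, and a single blending weight as the meta learner. The toy values are invented for demonstration, and the Embedded step is not shown.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def make_base(col):
    """Base learner factory: a linear model on one feature column (assumption)."""
    def fit(rows, ys):
        a, b = fit_line([r[col] for r in rows], ys)
        return lambda row: a + b * row[col]
    return fit


def out_of_fold(fit, rows, ys, k):
    """Predict every sample with a model trained on the other k-1 folds."""
    preds = [0.0] * len(rows)
    for fold in range(k):
        train = [i for i in range(len(rows)) if i % k != fold]
        model = fit([rows[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(rows), k):
            preds[i] = model(rows[i])
    return preds


def fit_stacking(rows, ys, k=3):
    """Stack two base learners; the meta learner is a blend weight w in [0, 1]."""
    base_fits = [make_base(0), make_base(1)]
    p1, p2 = (out_of_fold(f, rows, ys, k) for f in base_fits)
    # Closed-form w minimising the squared error of w * p1 + (1 - w) * p2.
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    num = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys))
    w = min(1.0, max(0.0, num / den)) if den else 0.5
    # Refit the base learners on all data for the final predictor.
    m1, m2 = (f(rows, ys) for f in base_fits)
    return lambda row: w * m1(row) + (1 - w) * m2(row)


# Toy data (invented): (departure delay, buffer time) -> next-station arrival delay.
rows = [(5.0, 1.0), (8.0, 4.0), (12.0, 2.0), (15.0, 5.0), (20.0, 3.0), (3.0, 6.0)]
ys = [r[0] - 2.0 for r in rows]

predict = fit_stacking(rows, ys)
print(round(predict((10.0, 3.0)), 2))
```

Training the meta learner on out-of-fold predictions, instead of in-sample predictions, keeps it from rewarding a base learner that merely memorizes the training data. A practical model would use stronger base learners and a richer meta learner.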