Prediction And Analysis Methods For Delays Caused By Traffic Accidents

Posted on: 2024-02-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Z H Wang | Full Text: PDF
GTID: 2542307076497434 | Subject: Transportation planning and management
Abstract/Summary:
Traffic accidents cause significant traffic delays and unnecessary financial losses. The mechanism and key characteristics of delays caused by accidents differ from those of other accident impacts. The main objective of this thesis is to develop methods for predicting and analyzing delays caused by traffic accidents and to validate the reliability of the proposed methods using 2020 accident data from Texas. A prediction method for accident-induced delays facilitates the timely acquisition of a more accurate picture of the accident situation. An analysis method based on these delays helps to understand the formation mechanism of accidents, so that targeted management policies and preventive measures can be implemented. The main work of this thesis is as follows:

(1) Research on the prediction method of delays caused by traffic accidents considering regional differences. A framework for predicting delays caused by traffic accidents based on Random Forest (RF) and LightGBM (LGBM) was developed and applied to different accident areas. First, the most relevant variables in each region are identified by recursive feature elimination based on Logistic Regression. Then, RF and LGBM models with hyperparameters tuned by grid search are used to predict the delays caused by accidents in each region separately. Finally, two threshold-moving methods, one maximizing the G-mean and the other maximizing the F1-score, are used to address the data imbalance problem. The results show that the RF-based prediction framework performs better in metropolitan areas, while the LGBM-based prediction framework performs better in non-metropolitan areas. The SHAP values show that highways, spring, and sunrise are the main factors for higher accident-induced delays in both regions. Excessive wind speed and temperature can lead to higher accident delays in metropolitan areas, while in non-metropolitan areas the corresponding factors are pressure and apparent temperature.

(2) Research on the prediction method of delays caused by traffic accidents based on the Stacking model. Prediction performance is further improved by establishing a framework for predicting delays caused by traffic accidents based on the Stacking model. In the two-layer Stacking model, seven different base classifiers are used in the first layer and three meta-classifiers with different advantages are tested in the second layer. The Stacking model is improved and simplified by three advanced methods: Bayesian hyperparameter optimization, multi-objective feature selection, and ensemble selection. Specifically, Bayesian optimization selects appropriate hyperparameters for the first-layer base classifiers, and feature selection finds the smallest and most efficient subset of features for them. Ensemble selection, which considers both diversity and model performance, selects the most efficient base classifiers to reduce the input of the second layer. The results show that the Stacking model significantly outperforms every base classifier. Feature selection significantly improves the efficiency of the Stacking model, and ensemble selection yields a simpler Stacking model. In addition, the combined optimization of feature selection and ensemble selection tends to yield the best-performing Stacking model. Finally, permutation importance identified six accident features that play an important role in prediction.

(3) Factor analysis of delays caused by traffic accidents based on latent class analysis and XGBoost-SHAP. The latent class analysis captures unobserved heterogeneity in the accident data and splits the entire dataset into several homogeneous clusters. XGBoost-SHAP models were then developed on each cluster to quantify the contribution of the accident features hidden in the latent classes, including single-factor effects and interaction effects. The results of the latent class analysis showed that season was the main factor generating data heterogeneity, so the accident data were divided into four clusters. The XGBoost-SHAP models show that the main contributing factors and interaction effects differ across the four clusters. Sunrise/sunset, peak hours, and crossings are the main contributors in fall and winter accidents, while traffic signals, weekdays, and junctions are the main contributors in summer and spring accidents. The interaction effect between highway and zone is different in the fall and winter accidents. Based on these results, targeted management recommendations are provided to regulatory agencies for different seasons.
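The threshold-moving step in part (1) can be illustrated with a minimal sketch. It assumes a binary label (1 = high delay, 0 = low delay) and predicted probabilities from any classifier such as RF or LGBM. The toy data and the function names (`confusion_counts`, `best_threshold`) are hypothetical and are not taken from the thesis, which may search thresholds differently.

```python
import math

def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when predicting class 1 for prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and recall (tn is unused)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(y_true, y_prob, metric):
    """Try each observed probability as a cut-off and keep the best one."""
    candidates = sorted(set(y_prob))
    return max(candidates,
               key=lambda t: metric(*confusion_counts(y_true, y_prob, t)))

# Hypothetical imbalanced validation set: 8 low-delay, 4 high-delay cases.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.55,
          0.35, 0.45, 0.60, 0.70]

t_g = best_threshold(y_true, y_prob, g_mean)
t_f1 = best_threshold(y_true, y_prob, f1_score)
print("G-mean threshold:", t_g)
print("F1 threshold:", t_f1)
print("F1 at default 0.5:", f1_score(*confusion_counts(y_true, y_prob, 0.5)))
print("F1 at moved threshold:", f1_score(*confusion_counts(y_true, y_prob, t_f1)))
```

On this toy data both criteria select a cut-off of 0.35 rather than the default 0.5, which recovers the minority-class cases that the default threshold misses. In practice the threshold would be chosen on a validation split and then applied to held-out data.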
Keywords/Search Tags:Delays caused by traffic accidents, Ensemble learning, Stacking model, SHAP values, Latent class analysis, Data heterogeneity