
Study On Prediction Of Highway Traffic Accident Severity

Posted on: 2024-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: T Zhang
GTID: 2542307157969789
Subject: Transportation
Abstract/Summary:
The China Statistical Yearbook shows that property losses and fatalities from highway traffic accidents in China are increasing year by year as motor vehicle ownership grows rapidly. Analyzing and predicting the severity of traffic accidents helps in designing targeted accident prevention plans and measures and in reducing accident severity. Based on a public Kaggle data set of traffic accidents in the United States, this thesis analyzes the factors affecting accident severity and the correlation between each feature and severity, and studies the severity of highway traffic accidents using the random forest, XGBoost, and logistic regression algorithms. First, the thesis describes the content, format, and quantity of the accident data and preprocesses it. Then it analyzes the influencing factors of highway traffic accidents from four aspects: human, vehicle, road, and environment. Because fatal accidents account for only 3% of the accident data, the SMOTE algorithm is used to oversample the fatal records. Random forest, XGBoost, and logistic regression are used to construct traffic accident prediction models, and the performance of each model is studied under different oversampling ratios. Next, important features are selected using a confusion matrix, the feature importances from the random forest and XGBoost algorithms, and cumulative feature importance curves. The best parameters for each model are determined using cross-validation and grid search. Accuracy, F1-score, recall, and AUC are used to compare the performance of the three models. Finally, the random forest prediction model is selected for visual interpretation. Based on the results of the model's visualization analysis, targeted improvement measures are proposed from the perspectives of human, vehicle, road, and environment.

The main work is as follows.
(1) This thesis employs three algorithms for modeling, namely random forest, XGBoost, and logistic regression. The data is balanced using the SMOTE algorithm. The results indicate that the highest predictive accuracy for binary classification (non-fatal accidents and fatal accidents) is achieved when the sample ratio is 1:1. Similarly, for three-class classification (property damage only, injury accidents, and fatal accidents), the highest accuracy is obtained when the sample ratio is 1:1:1.
(2) A comparison of the three algorithms for predicting the severity of highway accidents shows that the random forest algorithm performs best in both the binary and the three-class classification scenarios.
(3) This thesis analyzes the factors influencing highway traffic accidents and visualizes the random forest model using SHAP and partial dependence plots (PDP). The results show that factors such as time, latitude and longitude, weather, and visibility have a significant impact on accident severity. Targeted improvement measures are proposed based on the visualization results.
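The oversampling step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: it is a simplified, standard-library variant of SMOTE that picks a random minority sample, finds its nearest minority-class neighbours, and interpolates a new point between them. The function names `smote` and `oversample_to_ratio`, the toy two-feature data, and the neighbour count `k` are all assumptions made for illustration; a real study would more likely use a library implementation.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and one of their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def oversample_to_ratio(majority, minority, ratio=1.0, k=5, seed=0):
    """Add synthetic minority rows until minority:majority reaches `ratio`."""
    target = int(round(ratio * len(majority)))
    n_new = max(0, target - len(minority))
    return minority + smote(minority, n_new, k=k, seed=seed)


# Hypothetical toy data: 97 non-fatal rows and 3 fatal rows,
# mirroring the roughly 3% fatal share mentioned in the abstract.
data_rng = random.Random(42)
non_fatal = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(97)]
fatal = [[3.0, 3.0], [3.5, 2.5], [2.5, 3.5]]

# ratio=1.0 corresponds to the 1:1 sample ratio reported as best
# for the binary case.
balanced_fatal = oversample_to_ratio(non_fatal, fatal, ratio=1.0, k=2)
print(len(non_fatal), len(balanced_fatal))
```

Changing `ratio` (for example to 0.5) gives the other oversampling ratios against which model performance could be compared. Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic records.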
Keywords/Search Tags:Highway, severity of accidents, random forest, XGBoost, logistic regression