Tunnel crash risk prediction has long been a major topic in traffic safety research. Cross-river tunnels are crucial nodes of the urban road network: both ends connect to urban roads, traffic flow and alignment are more complicated than in general highway tunnels, and traffic safety problems are more prominent. Accurately predicting crashes in urban cross-river tunnels, uncovering their underlying causes, and taking effective measures to prevent potential crashes are therefore crucial to improving cross-river tunnel safety. First, data on crashes and their influencing factors were collected for two cross-river tunnels in China. The tunnels were divided into 64 basic sections using the zoning method and the homogeneity method. To account for the effect of temporal aggregation granularity on crash prediction, 40,898, 5,802 and 1,346 data samples were compiled at daily, weekly and monthly intervals, respectively. The temporal and spatial distributions of crash counts were analyzed, and the influencing factors were tested for multicollinearity and correlation. Then, a variable screening rule based on the relevance and importance of variables was proposed, and 13 variables were finally selected as explanatory variables. Four machine learning models were constructed: Random Forest, Multi-layer, Ridge regression and eXtreme Gradient Boosting (XGBoost). Two fixed-parameter models, Poisson and negative binomial, were calibrated. The non-parametric and parametric models were compared in terms of goodness of fit and prediction accuracy. Among the non-parametric models, XGBoost and Random Forest achieved relatively high prediction accuracy; among the parametric models, the negative binomial model outperformed the Poisson model. Predictions based on monthly aggregated crash data were better than those based on weekly aggregated data. Next, to account for the influence of heterogeneity in the crash data on model calibration, the fixed-parameter negative binomial model was extended to a random-parameter negative binomial model, under the assumption that the parameters are random and vary across samples. The goodness of fit of the two models was compared with a likelihood ratio test. The results show that the random-parameter model better describes the heterogeneity of the data and predicts better. Finally, based on the calibration results of the random-parameter negative binomial model, two methods for quantifying crash influencing factors are presented: the elasticity coefficient method and the random effect coefficient method. The degree of influence of each significant factor on crashes was calculated at each of the three time granularities. The results show that average daily traffic volume, road section length, crash location (distance to the tunnel entrance), longitudinal slope and year have the greatest influence on crashes, and the mechanism by which each factor affects traffic safety is analyzed. The results of this study can enrich the methodology of crash risk prediction for urban cross-river tunnels and provide a dependable basis for safety management departments to formulate targeted safety improvement measures.
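
As an illustration of the likelihood ratio test used to compare the fixed-parameter and random-parameter negative binomial models, the following is a minimal sketch and not the study's own code. It assumes the usual form of the statistic, LR = 2(LL_random - LL_fixed), referred to a chi-square distribution whose degrees of freedom equal the number of additional parameters estimated in the random-parameter model. The function names and the log-likelihood values in the example are hypothetical and are not results from the study.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-half)              # closed form for df = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(half))   # closed form for df = 1
    # Regularized upper incomplete gamma recurrence:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(ll_restricted: float, ll_full: float, df: int):
    """Return (LR statistic, p-value) for two nested models.

    ll_restricted: log-likelihood of the restricted (fixed-parameter) model.
    ll_full:       log-likelihood of the fuller (random-parameter) model.
    df:            number of extra parameters in the fuller model (assumption).
    """
    stat = 2.0 * (ll_full - ll_restricted)
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p_value = likelihood_ratio_test(ll_restricted=-1050.0, ll_full=-1040.0, df=2)
    print(f"LR = {stat:.2f}, p = {p_value:.6f}")
```

A small p-value would indicate that the random-parameter model fits significantly better than the fixed-parameter model. In practice the log-likelihoods would come from the fitted models, and a statistics library would typically supply the chi-square tail probability.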