In recent years, with the increasing number of haze events, visibility-related research has received wide attention, especially research on low visibility. Long periods of low visibility can increase the incidence of road traffic accidents and pose a great threat to traffic safety. Low visibility is also accompanied by high concentrations of pollutants, which can pose serious risks to human health and the environment. Therefore, accurate and efficient prediction of atmospheric visibility is of great practical significance for the safety of people’s lives and property. Qingdao is a typical pollution background city and its visibility is closely related to atmospheric pollution. After analyzing the temporal characteristics of visibility in Qingdao, this paper proposes two visibility prediction schemes, parameter optimization and model optimization, to improve the accuracy and efficiency of visibility prediction.

(1) The monthly variation and the hourly intra-day variation of visibility in different seasons in Qingdao were analyzed. There were significant seasonal differences in the intra-day variation characteristics and in the mean daily range of visibility. The causes of these seasonal differences were then analyzed by combining meteorological and pollutant parameters, and they were found to be closely related to seasonal differences in the pollution sources affecting visibility in Qingdao. Therefore, to avoid the influence of seasonal differences in pollution sources on visibility prediction, visibility is predicted after seasonal feature selection.

(2) The visibility prediction parameters were optimized to reduce the data requirements in practical application scenarios and to broaden the application scope of visibility prediction. The performance of five common machine learning methods, including eXtreme gradient boosting (XGBoost), light gradient boosting
machine (LightGBM), random forest (RF), support vector machine (SVM) and multiple linear regression (MLR), is compared under different training parameter schemes. Two lightweight visibility prediction schemes are established for different application scenarios. All five methods share the same data pre-processing, seasonal training, prediction and performance evaluation procedures. The results show that, in the application scenario of pollutant parameter optimization, the preferred scheme is a 6-parameter visibility prediction scheme that uses meteorological parameters and PM2.5 with an XGBoost or LightGBM model. This scheme can achieve the same prediction performance as the 11-parameter prediction scheme. The correlation coefficient (CC) of the results is around 0.85. In the application scenario of meteorological parameter optimization, the visibility ensemble prediction scheme established in this research can improve the correlation coefficient of the prediction results to 0.68-0.76. The parameter-optimized visibility prediction schemes can effectively reduce the data requirements, but there is still room for improvement in prediction performance in low-visibility scenarios. To further improve performance in these scenarios, the model construction method needs to be optimized.

(3) In terms of model optimization, this paper focuses on improving the accuracy of machine learning in visibility (V) prediction under different pollution scenarios. A new atmospheric visibility prediction method based on a stacking fusion model (VSFM) is established. The new method uses the stacking strategy to fuse two base learners (XGBoost and LightGBM) to optimize prediction accuracy. Furthermore, seasonal feature importance evaluation and feature selection were used to optimize prediction accuracy in different seasons with different pollution sources. The new VSFM was applied to 1-year
environmental and meteorological data measured in Qingdao, China. Compared with traditional non-stacking models, the VSFM performed better in all four seasons. The threat score (TS) of the VSFM was significantly better than that of the other models, especially in extremely low-visibility scenarios (V < 2 km), where the TS of the VSFM was 0.5 while the best of the other models was less than 0.27. This model optimization scheme can effectively compensate for the shortcomings of the parameter optimization schemes in low-visibility scenarios.

This paper improves the efficiency and accuracy of machine learning in visibility prediction through parameter and model optimization schemes. The results can not only provide a machine learning reference scheme for practical visibility prediction applications, but also help to deepen the understanding of the factors affecting visibility.
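
The two evaluation metrics quoted above, the correlation coefficient (CC) and the threat score (TS), can be illustrated with a minimal sketch. It assumes the common meteorological definition TS = hits / (hits + misses + false alarms), applied to the event V < 2 km, and the Pearson form of the CC; the abstract does not state the exact formulas used, so both are assumptions. The function names and the sample values are hypothetical and are not data from this study.

```python
from math import sqrt


def threat_score(observed, predicted, threshold_km=2.0):
    """Threat score for the event 'visibility below threshold_km'.

    Assumed definition: TS = hits / (hits + misses + false alarms).
    Cases where neither series is below the threshold are ignored.
    """
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        obs_event = obs < threshold_km
        pred_event = pred < threshold_km
        if obs_event and pred_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif pred_event:
            false_alarms += 1
    denom = hits + misses + false_alarms
    # No observed or predicted event at all: the score is undefined.
    return hits / denom if denom else float("nan")


def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


# Made-up visibility values in km, for illustration only.
observed = [1.2, 1.8, 3.5, 0.9, 10.0, 1.5]
predicted = [1.0, 2.4, 1.7, 1.1, 9.0, 1.9]

ts = threat_score(observed, predicted)  # 3 hits, 1 miss, 1 false alarm
cc = correlation_coefficient(observed, predicted)
print(f"TS (V < 2 km) = {ts:.2f}, CC = {cc:.2f}")
```

Under this definition the TS ignores correct predictions of non-events, so it is not inflated by the many hours in which visibility is well above 2 km. This may be why a model with a high overall CC can still score poorly on the TS in extremely low-visibility scenarios.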