
Improving The Accuracy Of Ozone Prediction Based On Machine Learning In China

Posted on: 2024-04-28, Degree: Master, Type: Thesis
Country: China, Candidate: K L Xiong
GTID: 2531307106475394, Subject: Resources and environment
Abstract/Summary:
Severe near-surface ozone (O3) pollution poses a significant threat to human health, ecosystems, the climate, vegetation and buildings. Accurate O3 predictions make it possible to better assess its impact on public health and help develop effective prevention and control measures. Data from ground-based stations are the most accurate, but the stations are few in number and unevenly distributed. Simulations from air quality models provide complete spatial and temporal coverage, but there are large biases between the simulations and the observations.

First, the bias of the Community Multiscale Air Quality (CMAQ) model simulations and its influencing factors were analysed, and a bias correction model was constructed based on the Random Forest (RF) algorithm. The RF model successfully captured the non-linear relationship between O3 and its influencing factors. The normalized mean bias (NMB) of the hourly O3 concentration (O3-1h), the daily maximum 8 h O3 (O3-Max8h) and the daily maximum 1 h O3 (O3-Max1h) decreased from 15.8%, 20% and 17% to -0.5%, 0.8% and 0.1%, respectively, and the correlation coefficient (R) improved from 0.78, 0.90 and 0.89 to 0.94, 0.95 and 0.94, respectively. The causes of the bias in CMAQ-simulated O3 were also explored. For O3-1h, the bias of nitrogen dioxide (NO2) may be the main cause. For O3-Max8h and O3-Max1h, the observations are the main cause of the bias.

Two multi-source data prediction models based on the LightGBM algorithm were then constructed to improve the accuracy of the CMAQ model for O3-Max8h. The first model (named LGBR) uses pollutant concentrations simulated by CMAQ, meteorological data simulated by the Weather Research and Forecasting (WRF) model, and latitude and longitude data as input variables. The second model (named LGBR_CHAP) uses the same setup but adds the O3-Max8h provided by the China High Air Pollutants (CHAP) dataset as an additional input variable. At the daily scale, the LGBR model reduced the root mean square error (RMSE) and mean bias (MB) by 3.15 μg/m³ and 2.07 μg/m³, respectively, compared to the original CMAQ model, and the LGBR_CHAP model reduced them by 5.61 μg/m³ and 4.18 μg/m³. At the monthly scale, the R of the CMAQ model improved from between 0.2 and 0.91 to between 0.4 and 0.92 after optimization with the LGBR model, and to between 0.5 and 0.94 after optimization with the LGBR_CHAP model. Spatially, the O3-Max8h simulated by the CMAQ model performed better in East China but worse in West China. After optimization with the LGBR and LGBR_CHAP models, the national station-averaged R of the CMAQ model improved from 0.77 to 0.83 and 0.88, respectively. Both models successfully captured the spatial and temporal patterns of O3-Max8h. Overall, both the LGBR and LGBR_CHAP models perform better than the original CMAQ model, and the LGBR_CHAP model has better predictive power than the LGBR model. Therefore, the LGBR_CHAP model was used to predict O3-Max8h for the whole country, and it successfully produced O3-Max8h data with high resolution (10 km × 10 km) and full coverage (100%).
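
The abstract reports NMB, MB, RMSE and R without stating how they are computed. The following is a minimal sketch, assuming the definitions commonly used in air quality model evaluation (MB as the mean of simulation minus observation, NMB as the summed difference divided by the summed observations, and R as the Pearson correlation). The function names and the sample concentrations are illustrative assumptions and are not taken from the thesis.

```python
import math

def mean_bias(sim, obs):
    """MB: average of (simulated - observed), in the data's units."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)

def normalized_mean_bias(sim, obs):
    """NMB in percent: summed (simulated - observed) over summed observed."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)

def rmse(sim, obs):
    """RMSE: square root of the mean squared difference."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))

def pearson_r(sim, obs):
    """R: Pearson correlation between simulated and observed values."""
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)

# Hypothetical O3-Max8h values in μg/m³ (invented for illustration only):
# the simulation overestimates every observation by a constant 10 μg/m³.
observed = [80.0, 100.0, 120.0, 140.0]
simulated = [90.0, 110.0, 130.0, 150.0]

print("MB   =", mean_bias(simulated, observed))
print("NMB  =", normalized_mean_bias(simulated, observed))
print("RMSE =", rmse(simulated, observed))
print("R    =", pearson_r(simulated, observed))
```

In this invented example R is close to 1 even though the simulation is biased high by 10 μg/m³, which illustrates why the abstract reports bias metrics (NMB, MB) alongside the correlation coefficient.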
Keywords/Search Tags: Machine learning algorithms, Air quality models, Ozone, Spatiotemporal distribution