| China faces severe air pollution, which seriously threatens public health. Studies of the adverse health effects of air pollution rely on precise knowledge of the spatial distribution of air pollutant concentrations as a basic data input. Land use regression (LUR) has been widely used for city-scale air pollution exposure modelling, and in recent years it has been extended to the national scale. However, accurately explaining the spatial and temporal variations of air pollutant concentrations over large areas remains a major challenge, and no methodological study has yet addressed the building of national-scale LUR models in China. This study first explored the construction of a GIS database. Categorized point-of-interest (POI) variables were introduced to represent information on emission sources, and the boundary-layer-height-averaged wind speed (BLHA-WS) was used as a proxy for the diffusion conditions of air pollutants. An automatic system for constructing LUR datasets was proposed, and a national-scale dataset on a 1 km × 1 km grid, consisting of 259 variables, was generated. A spatial model for air pollutants was then proposed, based on universal kriging (UK) with forward variable selection. The study further discussed the usability of different cross-validation (CV) approaches, the stability of the performance of linear LUR models, the effects on model performance of introducing meteorological variables, remote sensing observations, and UK, and the within-city variations of air pollutants. It also compared the performance of forward-selection-based and partial-least-squares-based models, and quantitatively evaluated the uncertainties of the national predictions from the final UK models. To model the temporal variations of air pollutants, temporal random effects were added to random forest regression to build a spatiotemporal model for NO2, and the performance of this model was analyzed intensively. The research produced nationwide predictions at 1 km resolution of annual PM2.5 and NO2 concentrations from 2013 to 2017, and spatiotemporally continuous monthly NO2 concentrations from 2014 to 2017. Based on these predictions, the spatial and temporal distribution characteristics of PM2.5 and NO2 in China, and the corresponding human exposure, were analyzed in detail. The results show that the categorized POI variables and BLHA-WS played important roles in the models. NO2 concentrations were more sensitive to land-use-related variables, while PM2.5 concentrations were more strongly correlated with meteorological variables. When meteorological variables and satellite data were included, 300 to 400 samples were sufficient to build a stable national-scale linear LUR model in China with little risk of overfitting. Satellite data and UK were complementary in making predictions more accurate: satellite data substantially improved performance at locations far from monitors by providing information on background concentrations, while UK improved the models in well-sampled areas by adjusting the residuals from the linear trend. The annual models based on forward selection and UK yielded 10-fold CV R2 values of 0.87 to 0.92 (PM2.5) and 0.73 to 0.79 (NO2), slightly better than the partial-least-squares-based models. The monthly NO2 model based on mixed-effects random forest yielded a sample-based 10-fold CV R2 of 0.82 with good temporal extrapolation ability, outperforming the random forest model. The predictions show that from 2013 to 2017 the population-weighted average PM2.5 concentration in China decreased consistently, while the population-weighted average NO2 concentration first fell and then rose. Most people in China still live in an air environment that is extremely detrimental to human health. NO2 pollution is severe in the three major city agglomerations in China and requires urgent attention. |
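
The population-weighted averages reported above can be illustrated with a minimal sketch. It assumes the usual definition, in which each grid cell's concentration is weighted by the population of that cell; the abstract does not state the exact formula or data handling used in the study. The function name `population_weighted_mean`, the rule for skipping cells with missing values or no population, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import math


def population_weighted_mean(concentrations, populations):
    """Population-weighted mean concentration over grid cells.

    Assumed definition: sum(c_i * p_i) / sum(p_i), where c_i is the predicted
    concentration and p_i the population of grid cell i. Cells with a
    non-finite value or a non-positive population are skipped (an assumption
    made for this sketch, not a rule taken from the study).
    """
    if len(concentrations) != len(populations):
        raise ValueError("concentrations and populations must have equal length")

    weighted_sum = 0.0
    total_population = 0.0
    for c, p in zip(concentrations, populations):
        if not (math.isfinite(c) and math.isfinite(p)) or p <= 0:
            continue  # skip cells with missing data or no residents
        weighted_sum += c * p
        total_population += p

    if total_population == 0:
        raise ValueError("no grid cell with valid data and positive population")
    return weighted_sum / total_population


# Hypothetical toy grid of four cells (values are made up, not from the study).
toy_concentrations = [80.0, 40.0, 20.0, float("nan")]
toy_populations = [1000.0, 3000.0, 0.0, 500.0]

pwm = population_weighted_mean(toy_concentrations, toy_populations)
print(pwm)
```

In this toy grid only the first two cells contribute, so the weighted mean is pulled toward the concentration of the more populous cell. This is why a population-weighted average can differ from a simple spatial average when pollution and population are unevenly distributed.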