With the development of its economy, China has become one of the regions most severely affected by PM2.5 pollution in the world. Spatiotemporal analysis of PM2.5 changes in a region, together with the relationship between PM2.5 and meteorological factors, is of great significance for managing PM2.5 pollution. Predicting missing or abnormal values in past PM2.5 data and predicting future PM2.5 concentration are also important steps: the former makes it possible to build a long-term, high-precision PM2.5 concentration data set, and the latter can provide PM2.5 pollution warnings for the government and the public. This study uses hourly historical meteorological data and air quality data from 2016 to 2018 in Jiangxi Province to analyze the spatiotemporal changes of PM2.5 concentration and its relationship with meteorological elements, and then builds PM2.5 concentration prediction models based on machine learning. The study aims to provide a scientific basis for air pollution management and basic research in Jiangxi Province. The main work and research results of the paper are as follows.

First, this study selected data from 17 weather stations and 57 air quality stations in Jiangxi, covering the 11 prefecture-level cities of the province. After spatiotemporal matching and quality control, two data sets for predicting PM2.5 concentration from meteorological factors and six data sets for predicting future PM2.5 concentration from historical data were obtained.

Subsequently, a spatiotemporal analysis of the PM2.5 concentration data in Jiangxi Province from 2016 to 2018 was carried out. The results show that the PM2.5 concentration first increased and then decreased over the three years: 44.38 μg/m³ in 2016, 45.99 μg/m³ in 2017, and 36.7 μg/m³ in 2018. PM2.5 pollution in northwestern
Jiangxi is more serious, while PM2.5 pollution in the eastern region is at a lower level. Jingdezhen is the city with the lowest PM2.5 pollution in Jiangxi Province in recent years, and its overall air quality reached the Grade II standard in 2018. The correlation study between PM2.5 concentration and various meteorological elements in Jiangxi Province shows that average temperature and PM2.5 change in opposite directions, whereas average pressure and PM2.5 follow the same trend. There is no obvious trend relationship between relative humidity and PM2.5 concentration. The relationship between the trends of PM2.5 concentration and surface temperature is not obvious, but the two share the same trend in their extremes. For wind speed, the PM2.5 concentration is high when the wind speed is low, and low when the wind speed is high.

Finally, using the pre-processed data, three prediction models (RF, XGBoost, and LightGBM) were constructed. Stacking was used to fuse the RF, XGBoost, and LightGBM models into a Stacking fusion model, and prediction comparison experiments were carried out. The experimental results show the following.

(1) For the hourly data set that predicts PM2.5 concentration from meteorological factors, the R² of all four models is greater than 0.85, and the prediction accuracy of the RF and Stacking models is high. The daily data set performs poorly: the prediction accuracy (R²) of all four models is below 0.8. Overall, the Stacking model predicts better than the other models.

(2) For the future PM2.5 concentration data sets, the prediction accuracy of each model gradually decreases as the prediction horizon lengthens. In the hourly data sets, the 1-hour, 6-hour, and 24-hour predictions perform better, with R² reaching 0.97 for the 1-hour PM2.5 prediction. For the daily data sets, R² ranges from 0.63 to 0.78, which may be
related to the lack of data in the daily data sets and the lack of temporal memory. In terms of prediction accuracy, the Stacking model again predicts better than the other models.

(3) The construction time and memory size of the four models were also analyzed. Taking the hourly and daily data sets that predict PM2.5 concentration from meteorological factors as examples, the construction times of the single models rank as RF >> XGBoost >= LightGBM, and the model memory sizes rank as RF >> LightGBM > XGBoost.

Overall, among the single models, XGBoost has certain advantages in prediction accuracy, build time, and memory consumption. The Stacking model has some advantages over the single models, but its construction efficiency and model size are not as good as those of XGBoost.
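
The future-concentration data sets and the R² scores mentioned above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function names, the number of lagged inputs, the sample values, and the persistence baseline are all assumptions made for illustration. The sketch only shows how a PM2.5 series might be turned into feature/target pairs for a given prediction horizon, and how R² is computed from predictions.

```python
from typing import List, Sequence, Tuple


def make_horizon_dataset(
    series: Sequence[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a PM2.5 series into (features, target) pairs.

    Each feature row holds the `n_lags` most recent observations up to time t;
    the target is the observation `horizon` steps after t.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        X.append(list(series[t - n_lags + 1 : t + 1]))
        y.append(series[t + horizon])
    return X, y


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly PM2.5 values (ug/m3), invented for illustration only.
pm25 = [35.0, 38.0, 42.0, 47.0, 45.0, 40.0, 36.0, 33.0, 37.0, 44.0, 49.0, 46.0]

for horizon in (1, 6):
    X, y = make_horizon_dataset(pm25, n_lags=3, horizon=horizon)
    # Persistence baseline (an assumption, not a model from the study):
    # predict that the future value equals the latest observed value.
    baseline = [row[-1] for row in X]
    print(horizon, len(X), round(r2_score(y, baseline), 3))
```

In practice the feature rows would be passed to a regressor such as RF, XGBoost, or LightGBM in place of the persistence baseline. A longer horizon leaves fewer usable samples and a weaker link between inputs and target, which is consistent with the drop in accuracy reported in (2).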