
Research On PM2.5 Concentration In Zhengzhou City Based On Stacking Fusion Model

Posted on: 2023-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Wang
Full Text: PDF
GTID: 2531306623979089
Subject: Applied statistics
Abstract/Summary:
Since the 1980s, with the rapid development of China's economy, air pollution has become an increasingly prominent problem. When PM2.5 pollution is severe, visibility is persistently reduced, which has a great impact on people's daily life and travel. It is therefore of great significance to investigate the relationship between PM2.5 concentration and its own historical data, other atmospheric pollutants and meteorological factors. This thesis uses daily data for Zhengzhou City, Henan Province, from January 1, 2014 to December 31, 2021: PM2.5 concentration, other pollutants (PM10, SO2, NO2, CO, O3) and meteorological factors (average air temperature, average air pressure, mean wind speed, mean relative humidity). Single time series models and single machine learning models are applied separately, and Stacking fusion models are then built from the single models. The performance of all the PM2.5 prediction models is compared and analysed using goodness of fit (R²), root mean square error (RMSE) and mean absolute error (MAE). The main results are as follows. Firstly, the variables studied are selected according to the correlation of PM2.5 concentration with other pollutants and meteorological factors. ARIMA prediction models are built in stages from historical data, and Random Forest (RF), LightGBM, Support Vector Regression (SVR) and AdaBoost prediction models are each built from the other pollutants and meteorological factors. The predictive performance of the single models varies considerably, with SVR giving the best fit and ARIMA the worst. Secondly, the four machine learning single models are combined into multiple fusion models using Stacking. The predictive performance of the fusion models differs little among them and is better overall, with SVR-AdaBoost-LR giving the best fit and RF-AdaBoost-LR the worst. Finally, the PM2.5 concentration predictions of the single machine learning models and the ensemble learning models are compared and analysed. Of all the models built in the thesis, SVR-AdaBoost-LR has the best fitting performance and is the most suitable for predicting PM2.5 concentration in Zhengzhou. In addition, the prediction performance of ensemble learning is significantly improved compared with the single machine learning models. A Stacking ensemble does not necessarily improve as the primary learners become more complex or more numerous; differences in the types of primary learners and the choice of secondary learner also need to be taken into account.
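As a rough illustration of the SVR-AdaBoost-LR structure described in the abstract, the sketch below stacks an SVR and an AdaBoost regressor under a secondary learner and reports R², RMSE and MAE. It is not the thesis's implementation: the data are synthetic stand-ins for the nine pollutant and meteorological predictors, "LR" is assumed here to mean linear regression, and the hyperparameters, the 80/20 chronological split and the 5-fold setting are illustrative choices, using scikit-learn's StackingRegressor.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the Zhengzhou observations): 9 columns play the
# role of PM10, SO2, NO2, CO, O3, temperature, pressure, wind speed, humidity.
rng = np.random.default_rng(0)
n_days = 400
X = rng.normal(size=(n_days, 9))
y = 60 + 25 * X[:, 0] + 10 * X[:, 3] - 8 * X[:, 7] + rng.normal(scale=5, size=n_days)

# Chronological split: the earliest 80% of days train, the latest 20% test.
split = int(0.8 * n_days)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Primary learners: SVR (with feature scaling) and AdaBoost.
# Secondary learner: linear regression (assumed meaning of "LR").
stack = StackingRegressor(
    estimators=[
        ("svr", make_pipeline(StandardScaler(), SVR(C=10.0))),
        ("ada", AdaBoostRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=LinearRegression(),
    cv=5,  # out-of-fold predictions of the primary learners feed the secondary learner
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# The three evaluation metrics named in the abstract.
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
mae = float(mean_absolute_error(y_test, y_pred))
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

Swapping the entries in `estimators` (for example, replacing the SVR with a random forest) gives the other fusion variants in the same way, which is one way to compare how the mix of primary learner types affects the final fit.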
Keywords/Search Tags: PM2.5 concentration, ARIMA, Machine learning, Stacking model