
Research On XGBoost Algorithm Of The Haze Prediction Model

Posted on: 2019-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: Z B Ma
GTID: 2370330605975335
Subject: Applied Statistics
Abstract/Summary:
Most haze research in China began after the U.S. Embassy released its PM2.5 data in 2011. In recent years, haze has spread outward across North China, and the Yangtze River Delta is also among the heavily polluted regions. Whenever the air heating pressure drops, haze forms over a large area. Forecasting haze with data mining algorithms has become an effective way to reduce its negative impact.
1. Descriptive analysis and selection of the factors affecting PM2.5 concentration were carried out using principal component analysis and random forest importance ranking. The five most influential variables were then selected to build the prediction model: wind speed, wind direction, pressure, humidity, and temperature.
2. Five commonly used models were compared by cross-validation: multivariate regression, logistic regression, time series, the AdaBoost algorithm, and the XGBoost algorithm. Based on overall forecasting performance and stability, the XGBoost algorithm was chosen to build the forecast model.
3. The objective function and procedure of the XGBoost algorithm are described in detail. Before the model was built, the training data underwent outlier handling and feature discretization. Adjusting the objective function raised the model's predicted AUC by 0.05, which improved classification accuracy. The convergence of the algorithm was also analyzed. The improved XGBoost model was then used to forecast haze accurately.
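The abstract does not say how the objective function was adjusted, so the following is only an illustrative sketch under stated assumptions. It shows the kind of per-sample gradient and Hessian that a boosting library with second-order objectives, such as XGBoost, expects from a custom objective, here for a class-weighted logistic loss, together with a small pairwise AUC calculation. The function names and the `pos_weight` parameter are hypothetical and are not taken from the thesis.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_logloss_grad_hess(margins, labels, pos_weight=2.0):
    """Gradient and Hessian of a class-weighted logistic loss.

    Assumed loss per sample (w = pos_weight if y == 1 else 1):
        L = -w * [y * log(p) + (1 - y) * log(1 - p)],  p = sigmoid(margin)
    Differentiating with respect to the raw margin gives
        grad = w * (p - y),  hess = w * p * (1 - p).
    """
    grads, hess = [], []
    for z, y in zip(margins, labels):
        p = sigmoid(z)
        w = pos_weight if y == 1 else 1.0
        grads.append(w * (p - y))
        hess.append(w * p * (1.0 - p))
    return grads, hess


def auc(scores, labels):
    # Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    g, h = weighted_logloss_grad_hess([0.0, 0.0], [1, 0], pos_weight=2.0)
    print(g, h)
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Weighting the positive (haze) class more heavily is one common way to change what a classifier optimizes. Whether the thesis used this or a different adjustment is not stated in the abstract.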
Keywords/Search Tags: Haze, Regression prediction, Time series prediction, Boosting algorithm, XGBoost algorithm