
Urban PM2.5 Concentration Prediction Based On Parallel Random Forest

Posted on: 2019-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: C R Ren
Full Text: PDF
GTID: 2321330569479976
Subject: Control Engineering
Abstract/Summary:
With the rapid development of China's national economy, people's material and cultural standards of living have improved. At the same time, the imbalance between environment and development has become increasingly apparent. The environment has been seriously damaged, mainly in its air, water and soil, all of which are essential to the survival of humans, animals and plants. In recent years, haze and other forms of air pollution have occurred frequently, disrupting normal production, daily life, work and study, harming people's physical and psychological health, and impeding sustainable social development. Scientific and effective forecasts of PM2.5, the major cause of haze, are needed so that people can take protective measures in advance and harm to humans can be reduced as much as possible. Taiyuan, a typical energy and chemical industry city, has long been troubled by air pollution, so studying PM2.5 concentration prediction in Taiyuan is of practical significance. This paper addresses the problem from the following three aspects.

First, the paper is based on air quality monitoring data and contemporaneous surface meteorological data for Taiyuan from 1 January to 31 December 2017. It mines these data and analyzes how PM2.5 concentration varies on monthly, weekly, daily and other time scales. It also analyzes the correlation between PM2.5 concentration and other air pollutants such as PM10, SO2, NO2, CO and O3, and the effects of meteorological conditions such as temperature, humidity, wind direction and wind speed on the diffusion of PM2.5. Finally, it studies the spatial-temporal correlation of PM2.5 concentration between the prediction site and its surrounding sites.

Second, during data preprocessing, the author uses K-Means to cluster the data and establishes a separate prediction model for each cluster. Exploiting the correlation among pollutants, the author uses random forests to build a model that fills in missing PM2.5 values. The author then uses undersampling to reduce or eliminate the adverse impact of class imbalance on the forecasting model. Finally, the author uses random forests on the Spark platform to build forecasting models for PM2.5 concentration and its grades. The models take time factors, meteorological conditions and inter-site correlation as features, and their forecasting results are evaluated. The results show that the proposed forecasting method predicts PM2.5 concentration in Taiyuan with high accuracy.
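The undersampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which undersampling variant the author used, so this example assumes plain random undersampling down to the size of the smallest class. The function name `undersample`, the field names `pm25` and `grade`, the grade labels and all values are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import random
from collections import defaultdict


def undersample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes so that every class
    ends up with as many rows as the smallest class (assumed variant)."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for label in sorted(by_class):
        # rng.sample draws `target` distinct rows without replacement.
        balanced.extend(rng.sample(by_class[label], target))

    rng.shuffle(balanced)  # avoid leaving the rows ordered by class
    return balanced


# Hypothetical records: the grade labels and PM2.5 values are made up.
records = (
    [{"pm25": 20 + i, "grade": "good"} for i in range(8)]
    + [{"pm25": 90 + i, "grade": "moderate"} for i in range(4)]
    + [{"pm25": 200 + i, "grade": "heavy"} for i in range(2)]
)

balanced = undersample(records, "grade")

# Count how many rows of each grade remain after undersampling.
counts = defaultdict(int)
for row in balanced:
    counts[row["grade"]] += 1
print(dict(counts))
```

With this variant, the rarest grade determines how much data survives, so it trades training-set size for balance. Whether the thesis balanced the classes fully or only partially is not stated in the abstract.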
Keywords/Search Tags:PM2.5, random forest, Taiyuan, forecast, data mining