| In recent years, air pollution has become increasingly serious and haze episodes have become frequent, mainly because of fine particulate matter (PM2.5), i.e. airborne particles with a diameter of less than 2.5 microns. However, haze forms in what is effectively a huge atmospheric reactor: its reaction mechanism involves a variety of spatio-temporal data and reaction types, and the associated mass transfer is very complex, so it is difficult to establish a first-principles model for haze prediction. Meanwhile, a growing number of air quality and meteorological monitoring stations have accumulated a large amount of spatio-temporal haze data. With the development of machine learning and deep learning, researchers increasingly apply these methods to build empirical haze prediction models from this massive data. Among deep learning methods, the Long Short-Term Memory (LSTM) neural network has become the most effective method for nonlinear prediction because of its strong ability to handle nonlinear data. At the same time, because of atmospheric flow, haze has complex physical and chemical spatial transport characteristics, which have received little attention so far. Building on the LSTM method, this thesis uses temporal and spatial data on various haze precursors and meteorological variables in Sichuan Province and fully considers the spatial transport of haze. Moreover, spatial econometric analysis (Moran's I index) and wind field type clustering are used in a novel way to study the spatial characteristics of haze prediction. Taking the Chengdu Plain as an example, haze pollution prediction and warning are studied by coupling spatial information. The main work of this thesis is as follows: (1) First, the geographical topography of the study region and the characteristics of the collected data are briefly introduced, and the data set collected in this region is preprocessed to ensure data integrity. Then, the spatial autocorrelation of
PM2.5 concentration in Sichuan Province is analyzed by calculating the global Moran's I index of PM2.5 concentration across the prefecture-level cities of Sichuan Province and combining the planar data with their relative geographical locations. After establishing that PM2.5 shows an overall positive spatial correlation across all prefecture-level cities in Sichuan Province, the Pearson correlation coefficients between the PM2.5 data of Chengdu and those of surrounding cities are calculated, and 7 surrounding stations are selected for their high correlation with the PM2.5 concentration in Chengdu, thereby determining the spatio-temporal modeling region. (2) To couple spatial data into the LSTM more effectively, 2D Convolutional Neural Networks (2DCNN) are used to filter the precursor data of Chengdu and reduce data noise. At the same time, Principal Component Analysis (PCA) is used to reduce the dimensionality of the meteorological data (8 groups) and precursor data (7 groups) from the stations surrounding Chengdu, reducing data redundancy. A new 2DCNN-LSTM(PCA) haze prediction model is proposed by coupling the above spatio-temporal data with the PM2.5 data of Chengdu and surrounding cities through LSTM. To make predictions at different time scales, the sliding window method is used to construct the input variables and predict values at different future times. Comparison with a single LSTM, Support Vector Regression (SVR) and Decision Tree (DT) prediction methods shows the advantages of the proposed 2DCNN-LSTM(PCA) model across different evaluation indexes. (3) Since the weather type represented by the wind field has an important impact on the atmospheric physical and chemical mass transfer of haze, local wind field data are adopted to reflect the spatial characteristics in the model, and the wind field in the study area is analyzed using k-means and hierarchical clustering methods based on Dynamic Principal Component Analysis (DPCA). The wind field is divided into several wind field types by clustering, and the similar wind
field migration conditions at different time points are assigned to the same type, with the wind field type label as the output. Finally, a wind field clustering prediction method based on LSTM (Cluster-LSTM) is established by combining the PM2.5 and precursor data of Chengdu with local wind field data. Comparison with SVR and Bayesian Linear Regression (BLR) shows that the proposed method predicts well on the autumn and winter monsoon data. Compared with the proposed 2DCNN-LSTM(PCA) model, Cluster-LSTM uses less data to describe spatial migration, thus achieving low-dimensional spatio-temporal modeling. |
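
Two of the steps named in the abstract, the global Moran's I index and the sliding window construction of model inputs, can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `global_morans_i` and `sliding_windows`, the toy binary adjacency weights, and the window and horizon values are all assumptions made for illustration, since the abstract does not state the spatial weights matrix or window lengths actually used.

```python
def global_morans_i(values, weights):
    """Global Moran's I for one variable observed at n locations.

    values  -- list of n observations (e.g. mean PM2.5 per city)
    weights -- n x n spatial weights matrix as nested lists
               (hypothetical binary adjacency in the example below)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of locations
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    squared_dev_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared_dev_sum)


def sliding_windows(series, window, horizon):
    """Turn a time series into (input window, future target) pairs.

    window  -- number of past steps fed to the model
    horizon -- how many steps ahead the target lies (1 = next step)
    """
    samples = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Toy example: four locations on a line, each adjacent to its neighbours.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = global_morans_i([1, 2, 3, 4], line_weights)       # neighbours alike
alternating = global_morans_i([1, 4, 1, 4], line_weights)  # neighbours differ
pairs = sliding_windows([1, 2, 3, 4, 5, 6], window=3, horizon=1)
```

In the toy data, the smoothly increasing values give a positive index and the alternating values give a negative one, which is the sense in which a positive global Moran's I indicates that neighbouring cities tend to have similar PM2.5 levels. Changing `horizon` in `sliding_windows` is one simple way to express prediction at different future times.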