Prediction And Clustering Of Air Quality In China From The View Of Functional Data

Posted on: 2019-04-18  Degree: Master  Type: Thesis
Country: China  Candidate: H R Li  Full Text: PDF
GTID: 2370330545470165  Subject: Mathematics
Abstract/Summary:
Functional data are a kind of complex data in which each observation is a function. These functions are usually defined over time, but the domain may also be spatial location, etc. Functional data analysis is favored by many statisticians, who have developed fitting, forecasting and other statistical methods for such data. In this thesis, visibility data in China are predicted from the viewpoint of functional data, and air quality data from many Chinese cities are analyzed by functional clustering. The main work is as follows.

The first chapter describes functional data and the state of research at home and abroad. It reviews the applications and shortcomings of the Gaussian process functional regression model and the mixed-effects model for functional data, as well as functional clustering methods. In addition, it introduces the generalized Gaussian process functional regression model and functional clustering methods for sparse data.

The second chapter builds a model based on generalized Gaussian process functional regression with mixed effects. Because the visibility data recorded in China over the past 50 years are not uniform, we use the generalized Gaussian process functional regression model with mixed effects and estimate its parameters. We then predict the visibility of Beijing from the visibility of 20 selected representative cities over nearly fifty years. A real-data example and simulation results show that the model not only describes the complex relationship between the response and the influencing factors well, but also greatly improves prediction efficiency.

The third chapter focuses on curve clustering. We cluster the batches based not only on the shape of the response curves but also on the relationships between the response variable and some input covariates. Air quality data from 161 cities in China are analyzed. An improved functional data clustering method is applied to the concentration of PM2.5, the main pollutant affecting air quality in China. Human factors such as industrial smoke (dust) emission are added as auxiliary clustering variables, and the variance is allowed to differ between clusters. Adjusting the penalty parameters gives a better fit to the PM2.5 concentration curves. The clustering accuracy of several clustering methods is compared in simulation, and the results show that the functional clustering method described in this chapter has the highest accuracy. After clustering, we find that the seasonal variation of PM2.5 concentration differs between clusters. Each city should therefore formulate environmental treatment measures suited to its own economic situation and climatic conditions.

To sum up, functional data analysis has important practical significance for environmental governance in China. Simulations and real data analysis show that the Gaussian process functional regression model with mixed effects can predict visibility data effectively when the response variable is non-Gaussian functional data, such as attribute data, and the covariates are functional variables. Clustering accuracy can be improved by adapting the clustering method to the characteristics of the data, for example by adding prior information, allowing heteroscedasticity across clusters, and penalizing the basis coefficients.
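As a rough illustration of the general idea behind penalized curve fitting followed by clustering, the sketch below fits each curve with a small Fourier basis under a ridge penalty on the basis coefficients and then groups the coefficient vectors with k-means. It is not the thesis's method: the basis, the ridge form of the penalty, the k-means step, the simulated curves and every function name are assumptions made for this example, and the auxiliary covariates and cluster-specific variances mentioned in the abstract are left out.

```python
import numpy as np


def fourier_basis(t, n_harmonics):
    """Design matrix: a constant column plus sine/cosine pairs on [0, 1)."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * k * t))
        cols.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(cols)


def penalized_coefficients(curves, basis, penalty):
    """Ridge fit per curve: solve (B'B + penalty * I) c = B'y (assumed penalty form)."""
    gram = basis.T @ basis + penalty * np.eye(basis.shape[1])
    return np.linalg.solve(gram, basis.T @ curves.T).T


def farthest_point_init(points, k):
    """Deterministic start: the first point, then the point farthest from chosen centers."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[dist.argmax()])
    return np.array(centers)


def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations on the coefficient vectors."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels


def simulate_and_cluster(n_per_group=10, penalty=1.0, seed=0):
    """Two synthetic groups of noisy periodic curves (made-up data, not PM2.5 records)."""
    rng = np.random.default_rng(seed)
    t = np.arange(24) / 24.0
    pattern_a = np.cos(2 * np.pi * t)
    pattern_b = -np.cos(2 * np.pi * t)
    curves = np.vstack([
        pattern_a + 0.1 * rng.standard_normal((n_per_group, t.size)),
        pattern_b + 0.1 * rng.standard_normal((n_per_group, t.size)),
    ])
    coefs = penalized_coefficients(curves, fourier_basis(t, 2), penalty)
    return kmeans(coefs, k=2)
```

Clustering the coefficient vectors rather than the raw observations means the penalty controls how much noise reaches the clustering step: a larger penalty shrinks the coefficients toward zero and smooths the fitted curves, while a smaller one follows the data more closely.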
Keywords/Search Tags:Functional data, Generalized Gaussian process regression, Mixed-effects, Clustering analysis, Air quality