The Study Of Statistical Methods Of Spatial Autocorrelation For Massive Spatial Areal Data

Posted on: 2020-05-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Luo
GTID: 1480305882989239
Subject: Cartography and Geographic Information Engineering
Abstract/Summary:
Spatial autocorrelation (SA) quantifies the degree to which a geographical variable distributed over a study area is correlated with itself across space. It is an important index for spatial data analysis and is relevant at every stage of that analysis. With the rapid development of internet and sensor technology, spatial data sets with very large sample sizes have become much easier to obtain than in earlier days. It is therefore necessary to generalize the properties of SA from small-to-medium sample sizes (e.g., on the order of 10^3) to big-to-massive sample sizes. Problems that come with big sample sizes (e.g., the p-value problem) also need new solutions.

This research focuses on massive spatial areal data (here, "massive" refers to spatial data with big sample sizes) and discusses problems relating to spatial autocorrelation. These problems are:
1) How to select a proper spatial autocorrelation statistic for a given spatial configuration and a given probability distribution of the variable of interest;
2) How to set a proper null hypothesis for spatial autocorrelation at the stage of statistical inference, and what the statistical distribution of the test statistic is under a nonzero null hypothesis;
3) How to deal with the widely discussed p-value problem in a spatial statistical setting.

The first problem can be regarded as exploring the big-sample properties of “descriptive statistics” in spatial statistics. The second and third problems relate to statistical inference: the former focuses on setting a null hypothesis that better matches empirical data, and the latter introduces an existing but rarely used method to solve the p-value problem. This research carried out the following work on the three problems.

1) The first part answers the first question by means of the asymptotic variances of two classical spatial statistics, the Moran Coefficient (MC) and the Geary Ratio (GR). Conclusions include:
· The asymptotic variance of the MC is more efficient and stable than the asymptotic variance of the GR. The former is insensitive to the probability distribution of the variable of interest, whereas the latter is affected by the kurtosis of that distribution;
· Except for some theoretical settings, the exact variances of the two statistics are equally efficient. Their asymptotic variances achieve decent accuracy relative to the exact ones at sample sizes on the order of 10^2, which means the asymptotic versions can be used to speed up computation;
· For the majority of spatial configurations, the asymptotic variance of the MC requires a smaller sample size to achieve a preset accuracy;
· The advantage in statistical power of the MC over the GR disappears as the sample size becomes large.

2) The second part employs the simultaneous autoregressive (SAR) model, which is the most widely used model in spatial statistics, and furnishes the statistical distribution of the nonzero spatial autocorrelation parameter in the model. The specific points are:
· A null hypothesis of zero spatial autocorrelation is not appropriate for empirical spatial data, which are always autocorrelated, so this part sets null hypotheses of medium and strong spatial autocorrelation for social demographic/economic data and for remotely sensed images;
· Fisher's Z-transformation and its generalized form cannot stabilize the variance of the estimated spatial autocorrelation parameter, which varies with the sample size and the degree of spatial autocorrelation, so a new method is required to express the variance;
· The function relating the variance of the estimated parameter to the parameter value is given by an expression similar to a beta distribution with equal parameters (both greater than 1);
· Hypothesis tests with a nonzero null hypothesis are implemented for two empirical data sets.

3) The last part introduces the relevant differences test to deal with the p-value problem in a spatial statistical setting. It also furnishes a scheme for determining the irrelevant limit, which is critical for the relevant differences test. The main work consists of the following:
· It argues that the irrelevant limit is a kind of effect size, and effect size is a method used in the literature to deal with the p-value problem;
· The irrelevant limits for the MC and for the SAR spatial autocorrelation parameter are determined across their feasible ranges;
· The relevant differences test is implemented in a spatial statistical setting, and convincing results are obtained.

This work is helpful for the analysis of massive spatial areal data. It furnishes theoretical as well as methodological references for selecting spatial autocorrelation statistics and for hypothesis-testing problems (i.e., setting a suitable null hypothesis for practical data and solving the p-value problem) in the process of spatial data analysis.
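For readers unfamiliar with the two statistics compared in the first part, the following is a minimal sketch of how the Moran Coefficient and the Geary Ratio are commonly computed. It uses the standard textbook definitions with a binary rook-adjacency weights matrix on a regular grid. The function names (`rook_weights`, `moran_coefficient`, `geary_ratio`) and the grid setup are assumptions made for illustration only and are not taken from the dissertation.

```python
import numpy as np


def rook_weights(nrows, ncols):
    """Binary rook-adjacency matrix for an nrows x ncols regular grid.

    Hypothetical helper for illustration: cells sharing an edge are neighbors.
    """
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if c + 1 < ncols:          # neighbor to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < nrows:          # neighbor below
                W[i, i + ncols] = W[i + ncols, i] = 1.0
    return W


def moran_coefficient(y, W):
    """MC = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z = y - mean(y)."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    n = y.size
    return (n / W.sum()) * (z @ W @ z) / (z @ z)


def geary_ratio(y, W):
    """GR = ((n - 1) / (2 * S0)) * sum_ij w_ij (y_i - y_j)^2 / sum_i z_i^2."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    n = y.size
    sq_diff = (y[:, None] - y[None, :]) ** 2   # all pairwise squared differences
    return ((n - 1) / (2.0 * W.sum())) * (W * sq_diff).sum() / (z @ z)


if __name__ == "__main__":
    W = rook_weights(4, 4)
    # A checkerboard pattern: every neighbor pair has opposite values.
    checker = np.array([(r + c) % 2 for r in range(4) for c in range(4)])
    # A left-to-right gradient: neighboring cells have similar values.
    gradient = np.array([c for r in range(4) for c in range(4)])
    print("checkerboard:", moran_coefficient(checker, W), geary_ratio(checker, W))
    print("gradient:    ", moran_coefficient(gradient, W), geary_ratio(gradient, W))
```

On the checkerboard the MC is strongly negative and the GR is well above 1, while on the gradient the MC is positive and the GR is below 1. The dense n-by-n matrix used here is only practical for small examples; massive areal data would need a sparse weights representation.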
Keywords/Search Tags: spatial autocorrelation, massive spatial areal data, asymptotic variance, statistical distribution for nonzero spatial autocorrelation parameter, relevant differences test