Research Of Key Techniques On Spatial Data Mining Based On Spatial Autocorrelation

Posted on:2008-02-27

Degree:Doctor

Type:Dissertation

Country:China

Candidate:C P Hu

GTID:1118360272476798

Subject:Computer application technology

Abstract/Summary:

The rapid development of computer, network, spatial data collection, and spatial database technologies has made spatial data larger, more complex, and more changeable, to the point that it exceeds the human capacity for analysis. The demand for discovering knowledge from spatial databases has grown accordingly, and a new research field has emerged to meet it: spatial data mining. Spatial data mining refers to the extraction from spatial databases of implicit knowledge, spatial relations, or other meaningful features or patterns that are not explicitly stored there. It is a new, multidisciplinary area that combines techniques from data mining, machine learning, pattern recognition, spatial databases, statistics, artificial intelligence, geographic information systems, remote sensing, decision support systems, and other fields.

This dissertation first introduces the basic theory of spatial data mining systematically and compares traditional data mining with spatial data mining. Because of the characteristics of spatial data, traditional data mining techniques are unsuited to mining knowledge from spatial databases. To mine novel, valid, and understandable knowledge from spatial databases, new theories, techniques, and methods must be studied. The research in this dissertation focuses on spatial clustering, spatial co-location rules, and spatial classification and prediction. The main contributions of this dissertation are as follows:

First, an improved density-based spatial clustering algorithm with sampling (IDBSCAS), based on DBSCAN, is proposed. It clusters large-scale spatial databases effectively and takes both spatial and non-spatial attributes into account. Because the algorithm adopts a new sampling technique, it need not execute a region query for every object in the neighborhood of a purity-core object, which saves considerable clustering time. In addition, it considers non-spatial attributes as well as spatial attributes by introducing the concept of the matching neighborhood, which improves clustering quality. Experimental results on 2-D spatial datasets show that IDBSCAS outperforms DBSCAN in both clustering efficiency and clustering quality.

Second, although there has been some research on mining spatial co-location rules, most researchers discuss only positive spatial co-location rules and do not consider negative ones. A novel positive and negative spatial co-location rules mining algorithm (PNSCLRMA) is proposed, which mines both positive and negative spatial co-location rules. To reduce the computational cost, the algorithm uses two optimization techniques: adopting star neighborhoods to reduce join operations, and defining an interesting degree to delete uninteresting spatial co-location patterns. Experimental results show that the algorithm mines positive and negative spatial co-location rules efficiently.

Third, a new spatial prediction model (MLR*) based on the multivariate linear regression (MLR) model is proposed. Spatial information is first added to the input variables by replacing each input variable with the weighted average of its neighbors; the new input variables are then fed to an MLR model to estimate the model parameters, and spatial prediction is made. Experimental results show that the MLR* model and the spatial auto-regression (SAR) model have almost identical effects on spatial prediction, while the MLR* model is computationally more efficient than the SAR model.

Finally, a spatial classification and prediction algorithm based on fuzzy c-means (SFCM) is proposed by introducing the concept of the fuzzy membership degree of a spatial object to a fuzzy cluster. The algorithm first clusters the dataset by fuzzy c-means; because spatial data are spatially autocorrelated, spatial information must be added to the fuzzy c-means algorithm for spatial clustering. It then computes the fuzzy membership degree of each spatial object to every fuzzy cluster and finds the cluster for which that membership degree is maximal. Lastly, the dependent variable value of the spatial object is estimated by the dependent variable value of the mean object of that cluster. Theoretical analysis and experimental results show that the algorithm outperforms the SAR model and the CPFCM method in classification and prediction accuracy, and is faster than the SAR model.
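
The MLR* idea summarized in the abstract (replace each input variable with a weighted average over neighboring objects, then fit an ordinary multivariate linear regression) can be illustrated with a short Python sketch. This is not the dissertation's implementation. The function names, the use of NumPy, the representation of neighborhoods as a non-negative weight matrix `W`, and the row normalization are all assumptions made for illustration; the abstract does not say how neighbors or weights are defined, so building `W` is left to the caller.

```python
import numpy as np


def row_normalize(W):
    """Scale each row of a non-negative weight matrix to sum to 1.

    Rows that sum to zero (objects with no neighbors) are left as zeros.
    """
    W = np.asarray(W, dtype=float)
    sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, sums, out=np.zeros_like(W), where=sums > 0)


def spatial_lag(X, W):
    """Replace each input variable by the weighted average over neighbors."""
    return row_normalize(W) @ np.asarray(X, dtype=float)


def fit_mlr_star(X, y, W):
    """Fit ordinary least squares on the spatially averaged inputs.

    Returns the coefficient vector [intercept, b_1, ..., b_p].
    Hypothetical helper: the name and signature are illustrative only.
    """
    X_lag = spatial_lag(X, W)
    design = np.column_stack([np.ones(len(X_lag)), X_lag])
    beta, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return beta


def predict_mlr_star(X, W, beta):
    """Predict the dependent variable from spatially averaged inputs."""
    X_lag = spatial_lag(X, W)
    return beta[0] + X_lag @ beta[1:]
```

Here `X` is an n-by-p array of input variables, `y` holds the n dependent values, and `W[i, j]` is the assumed weight of object `j` as a neighbor of object `i`. Because the spatial step is a single matrix product followed by a standard least-squares fit, no spatial parameter has to be estimated iteratively, which is consistent with the abstract's claim that MLR* is cheaper to compute than the SAR model.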

Keywords/Search Tags:

spatial data mining, spatial autocorrelation, spatial clustering, spatial co-location rules, spatial classification and prediction

Related items

1	A Research On Spatial Data Mining
2	Study On Techniques Of Spatial Database Based-data Mining
3	Design And Implementation Of Spatial Data Mining System (M-SDM) Based On MATLAB
4	Research On Spatial Co-location Pattern Mining Based On Spatial Compression Cliques
5	Study On Techniques Of Spatial Data Mining
6	Spatial Database Oriented Application Research On Spatial Data Mining
7	Research On Spatial Index Based On QAAR-Tree
8	Spatial Data Mining Classification Method And Its Application
9	Research And Application Of LBS Spatial Data Clustering Based On MapReduce
10	Research On Key Technologies Of Spatial Data Mining