
Research On Key Technologies Of Spatial Data Mining

Posted on: 2005-10-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z B Zhang
GTID: 1118360152969118
Subject: Computer software and theory
Abstract/Summary:
Spatial data mining is the nontrivial process of extracting hidden, valid, novel, potentially useful, and ultimately understandable patterns or knowledge from spatial databases. Its applications span the national economy and national defense, including GIS, meteorology, remote sensing, communication control, city planning, environmental investigation, geo-economics, and military game evaluation. Spatial data mining is therefore a promising field and a hotspot of current research. Moreover, AI-based clustering and classification algorithms give spatial data mining strong support.

Against this background, the author studies key technologies of spatial data mining. An improved spatial clustering algorithm named IDENCLUE is proposed to address the demerits of DENCLUE (DENsity-based CLUstEring). A spatial data classification algorithm named IRBFNN, based on an improved radial basis function neural network, is presented. The author also studies similarity join and k-nearest neighbor queries, which are closely related to spatial data mining, and a prototype spatial data mining system named DMSDM.

Clustering is an important technique of human cognitive activity. Clustering analysis divides data into groups according to a distance or similarity measure and discovers the distribution, structure, and patterns of the data. Because the similarity measure of the clustered objects heavily affects the clustering result, the author studies the problems of similarity measures and proposes a generalized distance. Clustering has become a very important spatial data mining method because it is an unsupervised classification method and requires little domain knowledge from the data analyst.
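To illustrate how strongly the choice of distance can affect which objects count as similar, here is a minimal Python sketch. The abstract does not define the author's generalized distance, so the Minkowski family is used here only as a familiar stand-in; the function names `minkowski` and `nearest` and the sample points are hypothetical.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p (a stand-in, not the author's generalized distance)."""
    if p == float("inf"):
        # Limit case: largest coordinate difference (Chebyshev distance).
        return max(abs(x - y) for x, y in zip(a, b))
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest(query, candidates, p):
    """Return the candidate closest to the query under the order-p distance."""
    return min(candidates, key=lambda c: minkowski(query, c, p))


query = (0, 0)
candidates = [(3, 3), (0, 4)]

# The same query gets a different nearest neighbor depending on the measure.
print(nearest(query, candidates, 1))             # Manhattan distance
print(nearest(query, candidates, float("inf")))  # Chebyshev distance
```

With these sample points, the Manhattan distance selects (0, 4) and the Chebyshev distance selects (3, 3), so any grouping built on nearest neighbors would change with the measure.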
Density-based clustering, which groups data according to differences in density, has become a common and effective clustering approach. DENCLUE is a clustering method based on generalized kernel density estimation. It has many merits: it supports very large volumes of data, clusters of arbitrary shape, and high-dimensional data; it is insensitive to noise; and it can discover the hierarchy of data clusters. It also has demerits. Its parameters are chosen by experience, which makes it difficult to use, yet they heavily affect the clustering result. It is also not very efficient because it does not make good use of high-density data grids. The author therefore proposes the improved algorithm IDENCLUE, which optimizes the algorithm parameters using density entropy. IDENCLUE reduces the algorithm's complexity and speeds it up remarkably by assigning a cluster tag to the data in high-density grids before running, by exploiting the relationship between the average density and the density, and by computing with one datum in place of certain data grids without losing precision. IDENCLUE effectively overcomes the demerits of DENCLUE, and experiments show that it is more efficient than DENCLUE.

Classification is a very important method in data mining. Neural networks are widely used in pattern recognition, signal processing, system identification, and other areas, and they have successfully solved classification problems in many application fields because of their abilities in learning, self-organization, function approximation, and massively parallel computing. Neural networks are therefore well suited to spatial data classification. However, neural networks have been criticized as an inscrutable "black box", unable to explain their operation or how they arrive at a given decision, because the decision mechanism is stored in a distributed manner in the connection weights.
Extracting rules from the neural network is therefore significant work. A spatial data cl...
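The kernel-density idea behind DENCLUE, as summarized in the abstract, can be sketched roughly as follows. This is a minimal illustration and not the author's IDENCLUE: it assumes a Gaussian kernel, uses a fixed-point (mean-shift style) update in place of a gradient step, omits the grid acceleration, and sets the parameters `sigma`, `min_density`, and `merge_radius` by hand, not by density entropy. All function names are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def density(x, points, sigma):
    """Gaussian kernel density at x: the summed influence of every data point."""
    return sum(math.exp(-sq_dist(x, p) / (2.0 * sigma ** 2)) for p in points)


def hill_climb(start, points, sigma, max_iter=200, tol=1e-8):
    """Move uphill to a local density maximum (a density attractor)."""
    x = tuple(start)
    for _ in range(max_iter):
        weights = [math.exp(-sq_dist(x, p) / (2.0 * sigma ** 2)) for p in points]
        total = sum(weights)
        # Fixed-point update: kernel-weighted mean of the data points.
        new_x = tuple(
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(len(x))
        )
        if sq_dist(new_x, x) < tol ** 2:
            return new_x
        x = new_x
    return x


def cluster(points, sigma, min_density, merge_radius):
    """Label each point by its density attractor; -1 marks noise."""
    attractors = []  # one representative attractor per cluster
    labels = []
    for p in points:
        a = hill_climb(p, points, sigma)
        if density(a, points, sigma) < min_density:
            labels.append(-1)  # attractor too weak: treat the point as noise
            continue
        for idx, rep in enumerate(attractors):
            if sq_dist(a, rep) <= merge_radius ** 2:
                labels.append(idx)
                break
        else:
            attractors.append(a)
            labels.append(len(attractors) - 1)
    return labels


points = [(0, 0), (0.2, 0.1), (0.1, 0.3),
          (5, 5), (5.2, 5.1), (4.9, 5.3),
          (10, 0)]
print(cluster(points, sigma=0.5, min_density=1.5, merge_radius=0.25))
```

On this toy input the two clumps receive labels 0 and 1 and the isolated point is marked -1. Changing `sigma` or `min_density` changes the result noticeably, which is the sensitivity to hand-picked parameters that the abstract names as a demerit of DENCLUE.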
Keywords/Search Tags: spatial data mining, clustering, classification, spatial query optimization, neural network