
Research On Spatial Clustering Algorithms

Posted on: 2008-09-07  Degree: Master  Type: Thesis
Country: China  Candidate: L Tao  Full Text: PDF
GTID: 2178360215950919  Subject: Management Science and Engineering
Abstract/Summary:
Advances in information technology have led to the continual collection and rapid accumulation of data in repositories. Spatial data mining, or knowledge discovery in spatial databases, refers to the extraction of implicit regularities, rules, or patterns hidden in large spatial databases. Finding clusters in spatial data is an active research area in spatial data mining.

The first part of this thesis proposes a novel density-based spatial clustering method that heuristically selects border objects, called DBSB. The algorithm expands clusters quickly by using a heuristic function to choose core objects in the border region of a known core object, and then merges some clusters through their border objects. That is, DBSB obtains the final clustering result in two clustering steps. Theoretical analysis and experimental results indicate that the algorithm is effective and efficient.

Continuous developments in computing and communication technology over wired and wireless networks have recently led to many pervasive distributed computing environments, which comprise several different sources of large volumes of data and several computing units connected to each other via local or wide area networks. The second part of this thesis presents a distributed clustering algorithm based on reverse k nearest neighbors (RkNN), called DCRkNN, which consists of three steps: (1) determination of a local model, which reduces the size of the dataset at each local node; (2) determination of a global model based on all local models; (3) updating of all local models. DCRkNN exploits the properties of RkNN and employs a decentralized data mining framework. It can easily be extended to distributed outlier detection and, to a certain extent, protects raw private, sensitive data at the different nodes. Experimental evaluation indicates that the DCRkNN approach produces high-quality clusters over data residing at different nodes of a network, with scalable transmission cost.
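
For readers unfamiliar with the reverse k nearest neighbor relation that DCRkNN builds on, the following is a minimal brute-force sketch of the general RkNN definition only: the RkNN set of an object q contains every object p that has q among its own k nearest neighbors. It is not the DCRkNN algorithm, whose local and global model construction is not specified in this abstract. The function names `knn` and `rknn`, the Euclidean distance, and the sample points are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def knn(points, i, k):
    """Return the indices of the k nearest neighbors of points[i], excluding i."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return others[:k]


def rknn(points, k):
    """Map each index q to the set of indices p such that q is in kNN(p)."""
    reverse = {i: set() for i in range(len(points))}
    for p in range(len(points)):
        for q in knn(points, p, k):
            reverse[q].add(p)
    return reverse


# Hypothetical data: four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
result = rknn(points, k=2)
print(result)
```

In this sample the distant point at index 4 appears in no other point's 2-nearest-neighbor list, so its RkNN set is empty. Objects with few or no reverse neighbors are commonly treated as outlier candidates, which is consistent with the abstract's remark that the approach can be extended to outlier detection.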
Keywords/Search Tags: Distributed Data Mining, Density-based Clustering, Spatial Clustering, Outlier Detection, Reverse k-Nearest Neighbor