Font Size: a A A

Query processing in spatial database systems: Declustering and clustering techniques

Posted on:1998-08-27Degree:Ph.DType:Thesis
University:University of MinnesotaCandidate:Ravada, SivakumarFull Text:PDF
GTID:2468390014974641Subject:Computer Science
Abstract/Summary:
The research question in this thesis concerns how to parallelize the spatial range and join query processing in order to support a high performance spatial database application. Data partitioning for the range query operation involves declustering of spatial data, while data partitioning for the spatial join involves clustering of spatial data. If the static partitioning methods fail to equally distribute the load among different processors, the load-balance may be improved by redistributing parts of the data to idle processors using Dynamic Load-Balancing (DLB) techniques.; In this thesis, we provide a framework for declustering collections of extended spatial objects by identifying the following key issues: (i) the work-load metric, (ii) the spatial-extent of the work-load, (iii) the distribution of the work-load over the spatial-extent, and (iv) the declustering method. We identify and experimentally evaluate alternatives for each of these issues.; In addition, we also provide a framework for dynamically balancing the load between different processors. We experimentally evaluate the proposed declustering and load-balancing methods on a distributed memory MIMD machine (Cray T3D) and shared-memory machine (SGI Challenge). Experimental results show that the spatial-extent and the work-load metric are important issues in developing a declustering method. Experiments also show that the replication of data is usually needed to facilitate dynamic load-balancing, as the cost of local processing is often less than the cost of data transfer for extended spatial objects. In addition, we also show that the effectiveness of dynamic load-balancing techniques can be improved by using declustering methods to determine the subsets of spatial objects to be transferred during run-time.; A spatial join is often performed in two steps: a filter step and a refinement step. In this thesis, we focus on the refinement step of the spatial join. The refinement step of the spatial join takes as input a sequence of pairs of tuples and checks each tuple to see if the join predicate is satisfied for that tuple. This is similar to the join index processing done in traditional relational databases. We develop min-cut graph partitioning based methods for join processing using a join index. We use min-cut graph partitioning as a new heuristic for solving the page access sequence problem for fixed size buffer in sequential systems. We show that the number of page accesses needed to compute a join using join index in a fixed buffer environment is bounded by the sum of sizes of the base relations and the size of the cut-set of the page connectivity graph. Since the min-cut graph partitioning aims to minimize the size of the cut-set, this proposed heuristic is a direct method. Experiments with benchmark data sets show that the graph-partitioning based heuristic outperforms the existing methods, particularly when join selectivity is high and buffer space is small. (Abstract shortened by UMI.)...
Keywords/Search Tags:Spatial, Join, Processing, Data, Declustering, Query, Min-cut graph partitioning, Methods
Related items