Font Size: a A A

Efficient join processing in spatial database systems

Posted on:1997-11-30Degree:Ph.DType:Thesis
University:University of MichiganCandidate:Lo, Ming-LingFull Text:PDF
GTID:2468390014980356Subject:Computer Science
Abstract/Summary:
Spatial databases are being used in an increasing number of application domains. Handling spatial joins efficiently has become a pressing problem, since most relational join methods are not directly applicable in the spatial domain. For example, the sort-merge join method requires a total order on join attributes values, which spatial attributes lack. Hash-join algorithms are efficient for natural joins, but spatial join criteria are too complex to be covered by the simple semantics of natural joins.; Most current spatial join algorithms rely on pre-computed spatial indices, but this approach is inefficient and limited in its usefulness. Existing spatial indices are mostly designed for spatial selections, and are inefficient when used for joins. Further, the operand datasets of spatial joins do not always have pre-computed spatial indices, for example, when they are dynamically generated by other database operations.; This dissertation proposes a new class of methods to support dynamic and efficient spatial join processing. Our work has two goals: first, to obviate the need for pre-computed spatial indices by designing spatial join methods that use dynamically constructed data structures, and second, to promote the efficiency of spatial join processing to the level of relational join processing.; We propose a new spatial index, called the seeded tree, which is effective for spatial joins and efficient to construct at join time. Three principles guide the design of the seeded tree method. First, the construction of a join index should exploit available information about the join, so that the index is tailored for that join. We therefore guide the construction of seeded trees with information extracted either directly from the underlying dataset or an existing index tree. Second, the structure of join indices should be optimized for joins, and free of constraints imposed for supporting selection operations. By relaxing the structure of seeded trees to various degrees, we lower the construction cost and make join processing more efficient. Third, buffer space and I/O should be carefully managed. We develop the technique of batch writes that efficiently reclaims buffer space and accesses disks using sequential I/O. Our performance studies show that our methods run much faster than older methods.; We then extend our spatial join methods, and propose the spatial hash-join approach, which combines the benefits of the relational hash-join paradigm and the seeded tree techniques. We examine the difficulties if applying relational techniques to spatial domains, and define a framework for designing spatial hash joins. Based on this framework, we design and test a spatial hash-join method, that requires no pre-computed indices, and performs better than current methods even when these methods are given pre-computed indices.; This thesis provides very efficient solutions to the hitherto difficult and expensive problem of spatial joins. By not requiring pre-computed indices, it also increases the range of choices available to spatial query optimizers, enabling them to devise more efficient plans for complex queries. The buffer and I/O management techniques developed in this thesis can be applied to reduce I/O cost significantly for both spatial and relational join processing.
Keywords/Search Tags:Spatial, Join, Efficient, I/O, Methods
Related items