Font Size: a A A

Incremental algorithms for proximity queries

Posted on:2001-11-24Degree:Ph.DType:Thesis
University:University of Maryland College ParkCandidate:Hjaltason, Gisli RunarFull Text:PDF
GTID:2468390014458429Subject:Computer Science
Abstract/Summary:
Proximity queries, in one form or another, are among the most common types of queries in modern database applications, whether it be for spatial data, multimedia data, or other complex data types. The nearest neighbor query seeks the closest (or several closest) objects to a query object in a collection of objects; for example, “find the city closest to the Grand Canyon”. More generally, the “query object” may be a collection of objects; for example, “find the store closest to a warehouse”. Such a query is termed a distance join query.; In this thesis the main focus is on incremental evaluation of proximity queries, where the result of the query is reported one object (or pair of objects) at a time. In other words, the number of desired result objects needs not be specified in advance. This is in contrast to traditional approaches, where the number of result objects must be fixed in advance (e.g., as in the k-nearest neighbor query). Thus, if more result objects: are eventually deemed necessary, the traditional approaches require that the query be recomputed from scratch. In contrast, incremental evaluation algorithms allow efficient pipelined execution of complex queries involving proximity.; Incremental algorithms for nearest neighbor and distance join queries for spatial data are presented, and extensions of the incremental nearest neighbor algorithm for similarity search in metric data are described. The incremental nearest neighbor algorithm is conceptually simple and general enough so that it works for a large class of data structures. When applied on hierarchical spatial data structures, the algorithm achieves an optimal number of node accesses, regardless of how many neighbors are requested. In principle, the incremental distance join algorithm shares these characteristics. However, due to the complexity of distance join queries, numerous challenges must be overcome to achieve efficient query evaluation.; Two additional topics related to proximity algorithms are covered in this thesis. First, we present algorithms for bulk-loading quadtree-based spatial data structures, i.e., for building such structures from scratch for some data set. Fast bulk-loading algorithms are important as they allow efficient processing of a complex spatial operation (such as a distance join) on unindexed data sets or temporary query results by first building a spatial index and then using it to compute the operation. Second, we consider several general embedding methods for metric data and examine the issue of whether or not they result in contractive embeddings. We show that contractive embeddings allow similarity search (e.g., with our incremental nearest neighbor algorithm) while ensuring that no relevant answers are left out.
Keywords/Search Tags:Incremental, Queries, Proximity, Data, Distance join, Query
Related items