Font Size: a A A

Upper and Lower Bounds for Sorting and Searching in External Memory

Posted on:2015-06-04Degree:Ph.DType:Dissertation
University:State University of New York at Stony BrookCandidate:Medjedovic, DzejlaFull Text:PDF
GTID:1478390017499141Subject:Computer Science
Abstract/Summary:
This dissertation presents variants on sorting and searching in external memory.;In the first part of the dissertation, we derive lower and upper bounds on sorting with different-sized records. We show that the record size substantially affects the sorting complexity, and so does the final interleaving of the smaller and larger records in the final sorted sequence: sorting costs more when large and small records are segregated than when they are interleaved in the final sorted order.;In the second part of the dissertation, we study the batched predecessor problem in external memory. Given the underlying sorted set S of size n, and a sorted query Q of size x ≤ nc, 0 ≤ c < 1, we study tradeoffs between the searching cost, and the cost to preprocess S. We give lower bounds in three external memory models: the I/O comparison model, I/O pointer-machine model, and the indexability model. Our results show that in the I/O comparison model, the batched predecessor problem needs O(logBn + 1/B) I/Os per element if the preprocessing is polynomial; with exponential preprocessing, the problem can be solved faster, in theta((log2n + 1)/B).;In the third part of the dissertation, we introduce alternatives to the well-known Bloom filter data structure. The quotient filter is designed for RAM, but with better data locality than the Bloom filter. The buffered quotient filter and the cascade filter are SSD-optimized alternatives to the Bloom filter. In experiments, the cascade filter and buffered quotient filter performed insertions 8.6-11 times faster than the fastest Bloom filter variant and performed lookups 0.94-2.56 times faster.
Keywords/Search Tags:External memory, Sorting, Lower, Filter, Searching, Bounds, Dissertation
Related items