Font Size: a A A

Characterizing the behavior of open addressing hash functions

Posted on:2004-02-16Degree:Ph.DType:Thesis
University:The University of New MexicoCandidate:Luo, WenbinFull Text:PDF
GTID:2468390011468479Subject:Engineering
Abstract/Summary:
Hash functions are among the oldest and most widely used data structures in computer science, originating in 1953. In this thesis, first, we propose and design a new family of hash functions, called improved exponential hashing. Improved exponential hashing has the ability to spread the table elements randomly as does exponential double hashing, and also produces full length probe sequences on all the table elements, which avoids insertion and search failures at arbitrarily high table loads. From this point of view, the improved exponential hashing combines the strengths of both linear double hashing and exponential double hashing, and is shown to perform better than both of them. Experimental results are presented to support our claims, which is followed by some theoretic analysis. Second, we study and compare the performance of different hash functions on modern computer architectures. We conduct a theoretic analysis of the problem, which is followed by extensive experimental results to support our conclusions. Third, we propose a technique to improve the performance of separate chaining by combining it with open addressing. Fourth, we derive dynamical system representations for different hash functions and study the spreading speed of clustered data under different hash functions. Then, we use information-theoretic arguments to measure the performance of hash functions. We compare the hash functions based on relative entropy between two probability distributions. One baseline distribution is the uniform distribution, the other one is the distribution created by different hash functions. In general, the better a hash function is, the faster it mixes up a cluster of keys over the whole table space. So, the speed of convergence of the corresponding distribution to the uniform distribution characterizes the behavior of hash functions. Finally, we study hash functions using algorithmic information theory. We use the concepts of algorithmic information and computable information content to characterize hash functions.
Keywords/Search Tags:Hash functions, Open addressing, Improved exponential hashing, Algorithmic information, Exponential double hashing
Related items