Font Size: a A A

An Efficient Algorithm For The Two-phase Bloom Filter

Posted on:2013-09-20Degree:MasterType:Thesis
Country:ChinaCandidate:F LiFull Text:PDF
GTID:2268330425484155Subject:Computer technology
Abstract/Summary:PDF Full Text Request
A Bloom Filter is a space-efficient data structure which performs membershipqueries over sets with allowable errors. It has been widely used in distributed systems.The existing algorithms have been improved by extending the standard Bloom filterinto various application fields, but these algorithms are not able to reduce the falseprobability in membership queries.To reduce the false probability of Bloom filter, this paper proposed a newTwo-Phase Bloom Filter algorithm (TPBF) which uses standard Bloom filter in twophases. In the first phase, a standard Bloom filter is used for preliminary filtering togenerate a member set and a set of false-positives. In the second phase, anotherstandard Bloom filter is used to further filter the false-positives generated in the firstphase. This paper also derived the optimal parameter design for TPBF algorithmbased on theoretical calculation.Thorough performance analysis for TPBF algorithm was performed fromperspective of theory and simulation in this paper. First, false probability and spacecost were compared between TPBF and standard Bloom filter based on theoreticalcalculation. Second, statistics of hash function calculation was analyzed for bothTPBF and standard Bloom filter by comparing their query time and based on thestatistics the ratio of average query time of TPBF to that of standard Bloom filter wasestimated. This was verified by simulation later in the paper. Theoretical analysis andsimulation experiment results showed that false probability of TPBF algorithmdecreased sharply with each element bit. When each element used10bits, the falseprobability of TPBF algorithm was four orders of magnitude lower than the standardBloom filter and the average query time for TPBF algorithm was lower than or equalto the standard Bloom filter. In addition, zero false incidents in the TPBF simulationwere analyzed and the relationship between the theoretical average number of falseand the probability to produce zero false in TPBF algorithm was found. Thetheoretical analysis and simulation results showed that the application design can besimplified by specifying the high probability to produce zero false in TPBFmembership query test.
Keywords/Search Tags:Bloom filter, membership query, probability of false, number of false, load factor, zero false
PDF Full Text Request
Related items