Font Size: a A A

The Research On High Performance Content Filtering And Distribution Technologies

Posted on:2010-08-26Degree:DoctorType:Dissertation
Country:ChinaCandidate:K HuangFull Text:PDF
GTID:1118330338982116Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
As link rates and traffic volumes are constantly growing, the packet contentfiltering technology faces high performance challenges. This means that Deep PacketInspection (DPI) is required to satisfy both line-speed packet processing and smallmemory requirements. As network size continues to grow with increasing dynamicand heterogeneous nodes, it is very challenging for the P2P content distributiontechnology to achieve efficient, robust, and scalable content dissemination acrosslarge-scale networks. In this dissertation, we focus on high performance packetcontent filtering and P2P content distribution technologies. The former involvestradeoffs between time and space complexities of DPI, while the latter involvestradeoffs among efficiency, fairness, and load balance of content distribution. Themajor contributions of this work are as follows.(1)Our packet content filtering technology includes hardware-based packetpreprocessing techniques and signature matching algorithms:The Fast Hash Table suffers from both high update overhead and large off-chipmemory requirements. We propose a Double-counter Bloom filtered Hash Table(DBHT). The Double-Counter Bloom Filter uses insertion and deletion counters perbucket for recording the number of inserted and deleted items. Experimental resultsdemonstrate that the DBHT is a time/space-efficient hash table, which significantlyreduces the off-chip memory accesses as well as the off-chip memory requirements, atthe cost of additional acceptable on-chip memory space.The Trie Bitmap Content Analyzer suffers from high update overhead and manyfalse positive memory accesses, while the Shared-node Fast Hash Table suffers fromboth high update overhead and large off-chip memory requirements. We propose anIndex-Split Bloom Filter (ISBF). The core idea of ISBF is that an index of off-chipitem is split apart into a group of bits, and each group uses several parallel CountingBloom Filters in on-chip memory to represent a set of items. Moreover, both lazedeletion algorithm and vacant insertion algorithm are proposed to reduce the updateoverhead of ISBF. Experimental results demonstrate that the ISBF achieves fast andmemory-efficient lookups, which significantly reduces the off-chip memory accessesand processing times as well as the on-chip and off-chip memory requirements.The bit-split string matching algorithm suffers from the unnecessary state transitions problem. We propose a byte-filtered string matching algorithm, whereBloom filters are used to preprocess every incoming character before performingbit-split string matching. Experimental results show that the byte-filtered algorithmenormously reduces the string matching times as well as the number of statetransitions.The Extended Finite Automaton (XFA) augments DFA with auxiliary statevariables and simple instructions for manipulating them, achieving significantmemory reduction in the aspect of states. We propose a Compact Finite Automaton(CFA) based regular expression matching algorithm, in order to further reducememory requirements in the aspect of transitions. The CFA exploits a priority-basedtransition compression scheme to compress multiple transitions with the same mostdestination state on an input label, and a bitmap-based transition lookup scheme tosearch transition subsets with differenet priority in parallel. Experimental resultsshow that the CFA significantly reduces both the number of transitions and thememory requirements, while almost keeping the same matching times with the XFA.(2)Our P2P content distribution technology includes BitTorrent performanceoptimization techniques and DHT-based balanced broadcast algorithms:BitTorrent suffers from a paradox of supply and demand between upload peersand request peers. We propose a dynamic quota-based peer selection strategy, wheredue to the principle of investment return, each upload peer adaptively allocates uploadquotas for the TFT and OU algorithms. Experimental results show that the dynamicquota scheme is efficiency and fairness tradeoffs, which sacrifices a part of uploadfairness to improve download efficiency of BitTorrent.We analyze the impact of selfish peer's free riding on the performance,fairness,and robustness of BitTorrent. We propose an activeness-based seed choking algorithm,where the activeness refers to the ratio of available download bandwidth to availableupload bandwidth of request peers, and a seed preferentially uploads request peerswith the highest activeness values. Experimental results show that the activenessalgorithm not only restrains the free riding of selfish peers, but also improves theperformance of benign peers, thus enhancing BitTorrent's robustness.Exiting DHT-based broadcast algorithms exploit a greedy finger routingalgorithm of DHT to construct a Distributed Broadcast Tree (DBT), leading to thelimitations of scalability and load balancing. We propose two DHT-based lightweightbroadcast algorithms. When nodes are uniformly distributed in the identifier space, atoken-based broadcast algorithm is proposed, where each node selects the finger nodes as its children by a token value. When nodes are arbitrarily distributed in theidentifier space, a partition-based broadcast algorithm is proposed, where each nodehierarchically partitions its identifier space into two subspaces and selects the agentnodes in the subspaces as its children. Experimental results demonstrate that both thetoken-based and partition-based algorithms can construct and maintain a scalable andload-balanced DBT, where the branching factor are at most two, and the tree heightis O (log n )in a Chord ofnnodes, without extra state space and explicit maintenanceoverhead.
Keywords/Search Tags:deep packet inspection, scalable content distribution, regular expressionmatching, index-split bloom filter, compact finite automaton
PDF Full Text Request
Related items