Font Size: a A A

Research On Several Fundamental Problems Of DHT Overlay

Posted on:2010-04-16Degree:DoctorType:Dissertation
Country:ChinaCandidate:X W NieFull Text:PDF
GTID:1118360275480049Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
Distributed hash table (DHT) is a new Peer-to-Peer (P2P) networking pattern,which has many advantages, such as efficiency, scalability and low cost, and has beenwidely deployed in various large-scaled distributed systems, ranging from file sharing,contents distribution, streaming media to VoIP. In such systems, the overlay formed withDHT plays an important role in supporting the upper layer applications. As a result, ithas drawn the attention of many researchers, and becomes the current research hotspot.This dissertation mainly focuses on some fundamental problems to be solved in theapplication of DHT. Starting from analyzing the basic model of DHT network, thedissertation investigates the zone balance in DHT overlay, explores how to resist theSybil attacks deeply, models the problem of estimating network size, and proposes anefficient random nodes sampling algorithm of DHT overlay.The major contributions of this dissertation are as below:1. The dissertation investigates the problem of dividing a line randomly, anddeduces two basic conclusions: in Chord network, the interval between nodes obeys theexponential distribution asymptotically, and the number of nodes appearing in a sectionof address space obeys the Poisson distribution asymptotically. The analysis andsimulation shows that the asymptotic accuracy of these two basic conclusions is veryhigh, and the error is very little. The conclusions are applicable not only to Chord, butalso to other types of DHT which satisfies the assumed condition that the identifiers aredistributed uniformly in the address space. They are the basis of analysis and modelingof the total dissertation.2. The dissertation proposes the zone balancing scheme based on static replica, anddeduces the zone load distribution of nodes in Chord and Pastry network, the virtualservers (VS) and the static-replica-based balancing scheme. The analysis shows that thezone load distribution of Chord, Pastry and VS obeys the Gamma distribution withsimilar type of parameters, and if putting replica on k succeeding nodes in Chord, thezone load distribution obeys the Gamma distribution with parameter (k, n). Compared with VS and the trie-based zone balancing scheme, the static-replica-based scheme cannot only achieve the balance of zone load, but also make the system robuster.3. The dissertation presents an identity self-verified secure framework (ISV), whichis combined with Cards-shuffling scheme (ICS) to resist Sybil attacks efficiently. Anexplicit certificate distributor (CD) is introduced into ISV to audit the requiring for newidentifiers, and distribute signatures to nodes, whereas the verifying of identity iscarried out by nodes themselves, which reduces the overhead of CD server greatly. ICSmakes use of the tickets signed by CD to record the joining process of nodes toguarantee the compulsive execution of k-rotation rule. To prevent the enemy fromaccumulating identifiers, ICS utilizes the substituting sections and publish timestampsto determine whether the identifier is overdue. The dissertation analyzes quantitativelythe number of tickets stored on the nodes, and gives the asymptotic closed-formsolution of the problem. The analysis shows that the average number of tickets stored oneach node is O(log~2 n), and the simulation demonstrates that the asymptotic solution isvery accurate, which illustrating the correctness of the analysis.4. The dissertation claims that the problem to estimate the network size is identicalto the parameter estimation of the exponential or Poisson distribution in DHT, andproposes the average-interval-based estimator (AIE) and the node-density-basedestimator (NDE). The dissertation deduces the probabilistic distribution of the networksize estimation in AIE and NDE, and discusses how to select the testing scope andposition of NDE. The analysis shows that the estimating accuracy of AIE is only relatedto the number of intervals measured, and is irrelevant to the network size. This showsthat AIE can adapt the variation of network size. The simulation proves that the measureerror is in line with the theoretical analysis entirely.5. The dissertation proposes a random node sampling scheme based on rejectionprinciple (RPS). The probabilistic distributions of the sampled probability and sampledinterval are analyzed against the single point heuristic sampling algorithm (HUR) andthe multi-points heuristic sampling algorithm (HURk). And a rejection algorithmobeying the reciprocal distribution (RDRP) is constructed. The analysis shows that RPSsamples online nodes with equal probability and the interval between nodes sampledobeys the exponential distribution; its complexity remains constant while the networksize increases; the impact of estimating error of network size on RPS is irrelevant to the network size itself; if the estimation is less than accurate network size, RPS will loserandomness to a certain degree. Finally, the dissertation utilizes the statistical testingand empirical testing methods to verify the analysis strictly. The result shows that RPSis tolerant to the estimating error of network size very much. This proves that RPS is anefficient sampling algorithm.
Keywords/Search Tags:P2P network, Distributed hash table (DHT), System modeling, Sybil attack, Load balance, Random sampling, Estimation of network size
PDF Full Text Request
Related items