Font Size: a A A

A Study On Large Scale Peer-to-Peer Search And Applications

Posted on:2009-09-11Degree:DoctorType:Dissertation
Country:ChinaCandidate:X C LuoFull Text:PDF
GTID:1118360275480081Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The peer-to-peer (P2P) computing model has been motivating many emerginglarge scale distributed applications in recent years.Due to the advantages such as lowcost,self-organization,robustness,scalability,privacy protection,and so on,P2Pmodel has become a promising architecture for constructing large scale distributedsystems.Millions of people have utilized the P2P tools to share diverse resources onthe Internet.Different from traditional centralized systems,P2P-based decentralizedinformation sharing applications,for example,P2PWEB,P2P-based WIKI,bring aboutnew challenges.Among them,efficient resource localization becomes the firstbottleneck challenge with the absence of a centralized repository.Meanwhile,P2Pusers commonly prefer a fair exchange of shared data,reflecting autonomy andfairness spirits of P2P model.However,fair exchange in P2P remains a challenge inthis research area.It is difficult,if not impossible,to maintain a Trusted Third Party(TTP) in a P2P environment.To address above problems,both replication and caching strategies are employedin this work to achieve efficient peer-to-peer search performance.A fair exchangeprotocol in a P2P environment without a dedicated TTP is also proposed to enrich thesharing models.The main contributions of this work are as follows.1.A replication-based probabilistically exhaustive search P2P system RPXS isproposed.Firstly,the probability model of exhaustive search is analyzed.Themathematical proof shows that random replication strategy achieves the optimal results.By randomly distributing rather small numbers of item and query replicas in theunstructured P2P network,perfect search success rate can be guaranteed with highprobability.The analysis also shows that the cost for such replication strategy isdetermined by the network size of a P2P system.Hybrid P2P architecture is proposedto address the problems of network size estimating and random peer sampling,whichcombines a lightweight DHT with an unstructured P2P overlay.Based on the services, RPXS can replicate data items/queries to random node subsets to efficiently implementprobabilistically exhaustive search.The analysis shows that RPXS has reasonablesearch cost and is very suitable for large scale text search.Comprehensive simulationis conducted to evaluate this design.Results show that the proposed scheme achievesperfect search success rate with quite small overhead.2.To further lower down the search cost,a replica network-based multi-hitquery P2P system is proposed.Strategies for correlating replicas of a data item aredesigned,which construct a logical replica network for each data item.The relatedparameters are determined by mathematical analysis and experiments.Based on thereplica network,queries can reach other replicas with only one hit by the replicapointers.Therefore,the multi-hit query can be achieved with low cost.The proposedreplica management strategy can be incorporated into RPXS to implement exhaustivesearch with low overhead.3.An interest-aware caching strategy for improving the P2P search performanceis proposed.Nodes advertise their resource information.Other nodes receive theinformation and cache information which they are interested in.To reduce the networkand storage overhead,the Bloom filter is employed to represent the interest of a node.A GOSSIP-based resource advertising algorithm is designed to propagate the resourceinformation rapidly in dynamic P2P network.To measure the importance of theresource information to a node,an availability potential model is proposed.Theexperimental result shows that the proposed caching strategy can improve the searchperformance efficiently.4.A sampling network-based node sampling scheme is proposed.Firstly,theparameter setting for efficient node sampling is analyzed.Then,a sampling networkconstructing algorithm is present.At last,a random walk-based node samplingalgorithm is proposed.The simulation shows that the proposed node samplingalgorithm has low overhead,low latency,and low deviation.The proposed nodesampling algorithm can be used as a building block of the distributed TTPconstruction.5.A fair exchange protocol in a P2P environment without a dedicated TTP isproposed.In order to provide the fairness of exchange,two parties involved in atransaction jointly select n random nodes to construct a distributed TTP.The Identity-based Encryption (IBE) scheme is employed to guarantee the secret ofexchanged content.To prevent the collusion of nodes in TTP,the threshold encryptionscheme is used and it also tolerates the failure of nodes.The analysis shows that theproposed scheme can provide the fair exchange requirements in a P2P environment.The results of this work can improve the search performance of large scale P2Pnetworks efficiently and enrich the resource sharing models.
Keywords/Search Tags:peer-to-peer computing, unstructured P2P, resource search, distributed hash table
PDF Full Text Request
Related items