Font Size: a A A

Load Balancing In Structured P2P Network

Posted on:2014-11-09Degree:DoctorType:Dissertation
Country:ChinaCandidate:D H LiuFull Text:PDF
GTID:1228330479479604Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
DHT has gained widely attention of researchers, and has been applied widely in traditional distributed systems and new distributed computation modes. After ten years development, the basic theories and core technologies of DHT have tended to be stable, and now researchers turn to focus on those problems encountered in practical applications, load balancing is one of the important aspects. Hotspots in traditional distributed systems also exist in the DHT based systems, and it is more important and difficult to be resolved in such systems: the weak capability of the nodes in P2 P networks make it important to achieve load balancing; Churn、heterogeneity and topology mismatching make it more difficult. This dissertation launches the research in view of the imbalance load caused by non-uniform query. The main contributions of this dissertation are as follows:(1) Severe skewed query would lead to imbalance load distribution, the nodes maintaining hot data will get higher load than other nodes in the system. In another way, replication/caching can contribute to achieve load balancing. Researchers have put forward many methods using replication scheme to improve the reliability of system, and the availability of data, and a great deal of theoretical analysis for efficiency、data maintaining and data consistency has been done, but there are lack systematic theoretical analysis for load balancing based on replication. For this problem, this paper take Chord as the sample, analyzes the access probability of different position copies based on the relative position in the query routing process, demonstrates that the probability of hit with an exponential growth according to the distance from the copy in the clockwise direction to the data object in descending order.(2) In those applications, such as SOA、search engines based on DHT, the severe skewed query will affect the availability of system, in another way locating resource with single word will reduce the efficiency of queries. This paper takes the problem as research background, and proposes a multi-attribute resource locating method(QFMA) for supporting load balancing. Based on MAAN, QFMA stores the resource description information in multiple nodes, and publishes the status information of nodes that contain the “hot” keywords through the routing path. The other queries do the target switching according to the status, and transfer the load of the nodes that contain the “hot” keywords to the other nodes with relatively light load to achieve load balancing. QFMA is effectively to balance the loads of the nodes that contain the “hot” keywords, and it can support multiple keywords searching, and it would not bring large overhead of the management of the copies.(3)In structured P2 P networks, the location information of nodes are easily lost in the process of generating the identifier, which would lead to the mismatching between the topology of physical network and logical network. Topology mismatching will affect the efficiency of resource location, and will increase routing and data transmission among autonomous systems. Replication/caching can contribute to achieve load balancing, however copy creation, copy management and query schedule will lead to extra cost of routing and data transmission. To resolve such problem, this paper proposes PB-Chord algorithm. Based on Chord network, PB-Chord allows the proximity nodes to join the cluster by taking the public DNS servers in the network as reference, the hot data are replicated in the idle nodes of the cluster, and all nodes have the global information of the cluster, including the load state information and data object information. In the query process, PB-Chord uses the above information to improve the efficiency of routing, and do the query scheduling without center, in order to balance the load of nodes in the cluster. Theoretical analysis and simulation results show that PB-Chord can achieve load balancing with duplication,with low cost of comunication, and can improve the efficiency of queries.(4) Obtaining the global load state in large dynamic P2 P network is a challenging work. In view of this situation, this paper precedes the exploratory research, and proposes an efficient algorithm(Hermes) that can effectively obtain the global load state in large scale structured P2 P networks, it implements the information propagation through the Gossip algorithm based on Push & Pull model. Besides, this paper proposes a novel approach to compress the global state information of the node, aiming at reduce the overhead of transmission and storage. Theoretical analysis and experimental results show that, it only need log N-2 cycles to spread the state of a node to the whole network with N nodes. If the load of the nodes is in accord with Zipf distribution, it requires small storage space to store the global state information of the large scale network. In the case with low state change of the nodes, this approach would not bring large communication overhead, and ensure high accuracy of the global state information.
Keywords/Search Tags:DHT, load balancing, replication, query schedule, Churn, Topology mismatching, Gossip
PDF Full Text Request
Related items