
The Research On WAN Distributed Storage Technologies Based On P2P Architecture

Posted on: 2014-07-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Yang
Full Text: PDF
GTID: 1268330425483969
Subject: Computer application technology
Abstract/Summary:
P2P computing overcomes the inherent limitations of the traditional client/server service model and offers strong robustness and scalability. With the explosive growth of information on the Internet, using P2P technology to build massive distributed data storage systems has become one of the most attractive ways to organize storage. Topological consistency, node dynamics, heterogeneity and autonomy in a P2P computing environment are the key problems and difficulties to consider when building such a system. To exploit the idle computing, storage and network resources of personal computers on the Internet and build a massive distributed storage system for a very large number of users, this paper surveys the distributed storage technologies in wide use today, analyzes the main problems in applying P2P technology to distributed storage, and discusses in detail the basic theories and algorithms for constructing a P2P-based storage system. To address the real-time and fault-tolerance deficiencies of existing P2P storage systems, it introduces a novel distributed storage framework, the Region Semantic Awareness Storage System (RSA-Store), and studies its components in detail: the storage overlay network, the data management model, the replica consistency maintenance method and the load balancing algorithm.

This paper puts forward a novel construction mechanism for the P2P storage overlay network to improve network topological consistency. Existing research improves topological consistency by measuring the network distances between nodes and then clustering the nodes into groups. However, network distance measurements are unstable, and this approach tends to incur aggregation overhead.
Based on the regional semantics inherent in the structure of the Internet, this paper establishes a hierarchical storage overlay network, RSA-HRing, and presents its topology maintenance mechanism. The mechanism includes an algorithm that selects the super node and the backup super nodes by combining Push and Pull, and a robust Super Node Failure Tolerance Algorithm (SNFT-RA) that guards against super node failure. An overlay network routing algorithm based on path vectors (Path-Vector, P-V) is then proposed, together with its detailed implementation. To overcome the triangle inequality problem that arises with the traditional delay and hop-count metrics, this routing algorithm uses path vectors to calculate network distances. Simulation results show that RSA-HRing significantly reduces the construction and maintenance costs of the overlay network topology, and that the P-V routing algorithm greatly reduces the actual physical routing overhead while preserving the overlay network routing scale.

For RSA-HRing, this paper puts forward a novel data management model that combines storage users' access behavior with regional activity characteristics. Based on regional awareness, the model uses a static data placement strategy, I(nter)-I(ntra) BS, to ensure accurate data location and fault tolerance. Meanwhile, by analyzing the characteristics of users' regional activity, the model adopts a dynamic replica creation method, RA-RCM, to improve data access performance. Combining the data placement strategy with the replica creation method, this paper designs a data location algorithm and a replica management mechanism in detail, and then analyzes mathematically the access cost of the model in RSA-HRing and the influence of node failure on the data access success rate.
Simulation results show that RA-RCM significantly reduces the hop count for data location, provided that the scale of cluster nodes and the backup threshold are controlled reasonably. Moreover, I(nter)-I(ntra) BS deals effectively with node failure, giving the system better data fault tolerance, especially after SNFT-RA is introduced.

This paper proposes a replica consistency maintenance method based on Node Heterogeneity Degree (NHD), called NHDCOM. Node heterogeneity is a typical feature of the RSA-Store environment, yet existing replica consistency maintenance algorithms do not take it into account. NHDCOM introduces the Node Heterogeneity Degree to denote node capability, organizes replica nodes with the Chord protocol, and proposes a ring splitting algorithm based on the node finger table. Theoretical analysis shows that this algorithm helps the source node acquire the NHDs of the other replica nodes at lower cost. This paper then formulates the problem of finding a minimum-delay update-content tree and proposes a heuristic algorithm that uses NHD, MDUT-H. Simulation results show that NHDCOM offers excellent efficiency and stability compared with existing algorithms.

This paper puts forward a novel VS-split load balancing algorithm, VSSLBA. The usual load balancing method is to create several virtual servers (VS) on one physical node on demand, which leads to the single virtual server problem (SVSP). According to the probability distribution of node intervals in a DHT-based overlay network, this paper proposes a load distribution model, then analyzes and calculates the probability of SVSP occurrence in detail. After that, it proposes a VS-split algorithm that not only solves SVSP but also reduces the overhead of maintaining virtual servers. Simulation results show that VSSLBA effectively solves SVSP and achieves better load balance.
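
As a rough illustration of why the intervals owned by nodes in a DHT matter for load balance, the Python sketch below simulates the conventional virtual-server approach that the abstract describes as the usual method. It is not VSSLBA, whose details the abstract does not give. The function names node_loads and imbalance, the unit-length ring, and the assumption that each virtual server takes a uniformly random position and owns the arc back to its predecessor (Chord-style) are all choices made for this example only.

```python
import random


def node_loads(num_nodes, vs_per_node, seed=0):
    """Return the fraction of a unit ID ring owned by each physical node.

    Assumption for illustration: every virtual server gets a uniformly
    random ring position and owns the arc from its predecessor to itself.
    """
    rng = random.Random(seed)
    points = []
    for node in range(num_nodes):
        for _ in range(vs_per_node):
            points.append((rng.random(), node))
    points.sort()

    loads = [0.0] * num_nodes
    if len(points) == 1:
        # A single virtual server owns the whole ring.
        loads[points[0][1]] = 1.0
        return loads

    for i, (pos, node) in enumerate(points):
        # points[i - 1] wraps to the last point when i == 0.
        prev = points[i - 1][0]
        loads[node] += (pos - prev) % 1.0
    return loads


def imbalance(loads):
    """Ratio of the heaviest node's share to the mean share (1.0 is ideal)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


if __name__ == "__main__":
    for vs in (1, 4, 16):
        print(vs, "virtual servers per node ->",
              round(imbalance(node_loads(200, vs)), 2))
```

Under these assumptions, running the script with more virtual servers per node should show the heaviest node's share moving closer to the mean, at the price of more virtual servers to maintain. That maintenance cost is the overhead the abstract says the VS-split algorithm aims to reduce.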
Keywords/Search Tags: P2P, Distributed storage, Topology consistency, Data management model, Replica consistency maintenance, Load balance algorithm