Research Of Replica Location And Replica Placement For Massive Data

Posted on: 2007-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: W Ruan
Full Text: PDF
GTID: 2178360215970402
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of digital and network technology, the volume of data on the Internet is growing exponentially. Data-intensive applications often produce massive data, and how to store and manage such data effectively on the Internet has become an active research topic. In massive data management, data replication can reduce access latency, improve data availability, and lower the node failure rate. Based on a study of massive data and existing massive data management systems, this thesis summarizes and analyzes their features and studies replica location and replica placement.

Popular DHT location techniques in structured P2P systems have two shortcomings: the topological mismatch between the overlay and the underlying network, and network hotspots. To address them, a hierarchical DHT location method based on geometric space partitioning, called HLSP, is proposed. HLSP uses Global Network Positioning (GNP) to construct overlay networks on the basis of "geographical proximity". It builds a hierarchical DHT based on geographical scope, so that the location information of data objects is distributed over several nodes that form a logical hierarchy in the network. HLSP offers several operations on data objects, such as publish, withdraw, and query. Packet forwarding in HLSP uses a greedy algorithm guided by a destination coordinate stamped in the packet header. HLSP also supports dynamic maintenance of each node's local neighbor list as nodes join or leave.
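The greedy forwarding step described above can be illustrated with a minimal sketch. It assumes that every node has a coordinate in a Euclidean space (as GNP-style positioning would provide) and a list of overlay neighbors; the names `greedy_route`, `coords`, and `neighbors` are hypothetical and do not come from the thesis, and the hierarchical DHT layer of HLSP is not modeled here.

```python
import math


def greedy_route(coords, neighbors, source, dest_coord):
    """Forward hop by hop toward dest_coord (illustrative sketch only).

    coords:     dict mapping node name -> coordinate tuple (assumed known)
    neighbors:  dict mapping node name -> list of neighbor node names
    source:     name of the node where the packet starts
    dest_coord: destination coordinate carried in the packet header

    Returns the list of nodes visited. Forwarding stops at the first node
    that has no neighbor strictly closer to dest_coord than itself.
    """
    path = [source]
    current = source
    while True:
        # Pick the neighbor whose coordinate is nearest to the destination.
        best = min(
            neighbors.get(current, []),
            key=lambda n: math.dist(coords[n], dest_coord),
            default=current,
        )
        # Stop when no neighbor improves on the current node.
        if math.dist(coords[best], dest_coord) >= math.dist(coords[current], dest_coord):
            return path
        current = best
        path.append(current)
```

Because each hop must strictly reduce the distance to the destination coordinate, the loop always terminates; the node where it stops is the one a scheme of this kind would treat as responsible for that coordinate.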
The simulation results show that HLSP is an effective DHT location method: an overlay topology based on geographical proximity resolves the topological mismatch between the overlay and the underlying network, and a hierarchical DHT based on geographical scope avoids network hotspots and achieves highly efficient searches.

For the replica placement problem under latency constraints, a dynamic latency-driven replica placement algorithm called DLDRP is proposed, in which replicas are always placed on nodes that satisfy the latency constraints. DLDRP first defines an abstract geometric model, from which the regions of candidate replicas are determined. The problem is thereby transformed into finding a region in geometric space that meets certain constraints. DLDRP then adjusts the placement of replicas dynamically according to the load source of a new replica: either a new replica is added on a node, or several existing replicas are merged. The simulation results show that DLDRP effectively meets clients' delay bounds, reduces access latency, minimizes the number of replicas, and achieves a uniform distribution of replicas and load balance among nodes.
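To make the idea of latency-constrained placement concrete, the sketch below uses a standard greedy set-cover heuristic. This is not the DLDRP algorithm itself: it is a swapped-in technique that only illustrates choosing few replica nodes so that every client lies within a latency bound. It assumes latency can be estimated as Euclidean distance between coordinates, and the names `place_replicas`, `clients`, `candidates`, and `bound` are hypothetical. The dynamic add and merge steps of DLDRP are not modeled.

```python
import math


def place_replicas(clients, candidates, bound):
    """Greedy set-cover sketch of latency-bounded replica placement.

    clients:    dict mapping client name -> coordinate tuple
    candidates: dict mapping candidate node name -> coordinate tuple
    bound:      maximum allowed estimated latency (coordinate distance)

    Returns a list of chosen node names such that every client is within
    `bound` of at least one chosen node. Raises ValueError if some client
    cannot be served by any candidate.
    """
    # For each candidate node, the set of clients it can serve within the bound.
    cover = {
        node: {c for c, pos in clients.items() if math.dist(pos, node_pos) <= bound}
        for node, node_pos in candidates.items()
    }
    uncovered = set(clients)
    chosen = []
    while uncovered:
        # Choose the node serving the most still-uncovered clients
        # (ties broken by node name for determinism).
        best = max(sorted(cover), key=lambda n: len(cover[n] & uncovered), default=None)
        gained = cover[best] & uncovered if best is not None else set()
        if not gained:
            raise ValueError("some clients cannot be served within the bound")
        chosen.append(best)
        uncovered -= gained
    return chosen
```

A heuristic like this trades optimality for simplicity: it tends to keep the number of replicas small but does not guarantee the minimum, and it does not balance load across the chosen nodes.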
Keywords/Search Tags: massive data, data replication, replica location, replica placement, Distributed Hash Table (DHT), latency estimation