Font Size: a A A

Research On High-efficient Data Transmission Techniques In Large-Scale Distributed Erasure-Coded Storage Systems

Posted on:2016-05-03Degree:DoctorType:Dissertation
Country:ChinaCandidate:J ZhangFull Text:PDF
GTID:1108330509461009Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
The explosive data growth brings new challenges to the big data storage. Commodity servers and disk drives are more and more used to deploy storage systems due to economic concerns, which makes the data reliability a significant problem in system designs. Distributed storage provides high reliability via maintaining redundant data for fault tolerance. The erasure-coded storage becomes one of the most popular reliable storage solutions due to its high storage efficiency from generating redundant data by erasure coding. Most erasure-coded storages are deployed in large-scale clusters for data backup and archiving. The data encoding, placement, access and recovery generate a burst of traffic in erasure-coded storage systems. The burst traffic occupies substantial bandwidth and results in the performance degradation of the whole cluster. Consequently,the applications and services hosted in the same data center suffer by the burst traffic as well. Moreover, more traffic incurs more energy consumption. Therefore, the research on high-efficient data transmission techniques in large-scale distributed erasure-coded storage systems shows substantial research and application importance.The research on erasure-coded storages has many challenges. First, reliability, storage efficiency, and data reconstruction costs are three main concerns in erasure-coded storages which trade-off each other. To achieve the same quality of reliability, increasing storage efficiency(reducing the amount of redundancy) usually leads to the raise of the complexity and costs of data reconstruction, and vice versa. Therefore, the trade-off of reliability, storage efficiency, and data reconstruction costs is a significant challenge.Second, bandwidth cost and latency are two fundamental concerns in data reconstruction. Generally, reducing bandwidth cost contributes to the decrease of latency. However, sometimes, reducing bandwidth cost leads to the increase of transmission hops, which shows a contradictory effect of these two factors. Therefore, obtaining suitable tradeoff between these two concerns is a challenging problem. Third, large-scale distributed erasure-coded storage is the integration of logical erasure code schemes and physical cluster network topologies. Most existing work overlook the practical network topology of the cluster, but use an ideal connected graph as the overlay in data transmission, which can not truly express the transmission cost. Since the data transmission route in practical network topologies highly impacts the transmission cost, considering physical cluster network topologies along with erasure code schemes in erasure-coded storages is another challenge.To well address the difficulties and challenges above, in this thesis, we focus on the main parts of erasure-coded storage systems such as data encoding, placement, and reconstruction. The encoding part generates redundant data for fault tolerance. Then the redundant data and original data are stored on the cluster by data placement. When failures occur and lead to data loss, the lost data can be recovered and accessed through data reconstruction. This thesis systematically investigates some key issues of high-efficient data transmission techniques in large-scale distributed erasure-coded storage systems as follows.First, we focus on the transmission cost in data reconstruction which is the most important part of erasure-coded storage systems. We consider the erasure code scheme and practical cluster network topology together to propose Aggregation Decoding. Aggregation Decoding exploits the aggregation feature of data transmission in the erasure decoding, and splits the final decoding into several partial decodings, then performs partial decoding in disperse and parallel to reduce overall data transmission cost. We consider the practical network topology to construct efficient route for best exploiting Aggregation Decoding. We formulate the routing problem and propose an ant-colony weighting-based heuristic routing algorithm. Aggregation Decoding and the heuristic routing algorithm achieve significant transmission cost savings in erasure-coded storage decodings.Based on the research on single-failure data reconstructions above, we further investigate the multi-failure recovery. We survey the related literature to demonstrate the significance of multi-failure recovery and provide a Markov-based multi-failure model.We observe data duplication and information redundancy in multi-failure recovery, and propose Redu to reduce the redundant data transmission. In Redu, we propose mergingbased de-duplication to reduce duplicated data transmission, and aggregating-based deredundancy to reduce redundant information transmission. Moreover, we propose cooperative routing to efficiently exploit the two schemes above based on the practical cluster network topology.For the data placement in erasure-coded storage systems, we study the redundancy layout placement problem. Redundancy layout represents the mapping of redundant data and original data in erasure code schemes. Redundancy layout placement refers to the match of the redundancy layout and the cluster network topology, which determines the placement of data fragments on the storage cluster. We consider the practical heterogeneous failure pattern, and study how redundancy layout placement impacts the reconstruction performance. Then, we propose He Match, which optimizes the placement of erasure-coded data fragments based on heterogeneous failure patterns to reduce the transmission cost of data reconstructions and improve the system reliability as well.We investigate the erasure encoding and data write process to reduce the transmission cost of the encoding phase in erasure-coded storage systems. We notice the aggregating and spreading features of data transmission in erasure encodings, and propose tree-based cooperative encoding method Redu E based on these features. We propose trunk-based de-duplication and branch-based aggregation encoding to reduce the overall transmission cost of the encoding phase in erasure-coded storage systems.To summarize, our works present solutions to several essential issues of data transmission cost in large-scale distributed erasure-coded storage systems. Comprehensive experiments demonstrate that the proposed solutions properly achieve their design goals.
Keywords/Search Tags:Distributed Storage, Erasure Code, Transmission Cost, Data Reconstruction, Multi-Failure Recovery, Redundancy Placement, Erasure Encoding
PDF Full Text Request
Related items