
Distributed Storage Framework Design For Heterogeneous Data Reliability Requirements

Posted on: 2022-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2518306788456614
Subject: Telecom Technology
Abstract/Summary:
In the information age, data volumes keep growing, and the reliable storage of massive data is an important research problem. Erasure coding (EC) is a block-level redundancy mechanism that provides high fault tolerance at low redundancy, so it is widely used in distributed storage systems (DSSs) and cloud storage systems (CSSs). However, when repairing data lost to node failures, EC must transfer a large number of surviving blocks across domains, and this bottleneck prevents EC from reaching its full performance in DSSs. Many related works optimize the repair performance of EC, but they mainly aim at reducing the number of blocks transferred across domains during repair. Meanwhile, studies show that storage medium reliability is highly heterogeneous in large-scale clusters. The block mapping strategy used when data are written with EC therefore directly affects the probabilities of normal read, degraded read, and read failure, and thus data access performance. In production, moreover, the value density of data is very uneven, which leads to varied data access requirements, so the reliability requirements of data are also heterogeneous. Based on this discussion, this thesis designs a new distributed storage framework supporting EC, which further alleviates the performance bottleneck of EC in two main ways: block placement and data compression.

The main work of this thesis is summarized as follows. (1) It analyzes the impact of different block layout methods on the probabilities of the various read outcomes in an environment with heterogeneous storage medium reliability, and proposes an intelligent block layout scheme based on a deep Q-network (DQN) that reduces data redundancy as far as possible, and thus storage overhead, while still meeting the different data access requirements. (2) It applies data compression to the blocks generated by EC, reducing the amount of network transmission while further improving the storage utilization of EC in DSSs. (3) It implements the proposed framework in a real cluster environment and compares its performance against several state-of-the-art EC schemes. Experimental results show that the write performance of He Re EC improves by 4.8% to 12.7% compared with erasure coding alone, while normal read and degraded read performance improve by 18.2% to 22% and 27.4% to 30.3%, respectively.
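
The link between block layout and read outcomes can be illustrated with a minimal sketch. It is not the DQN-based scheme of the thesis; it only computes the three read probabilities for one stripe under assumptions that are not stated in the abstract: an MDS code (any k of the k + m blocks can reconstruct the data), independent node failures, and a known availability probability per node. The function name and the availability values are hypothetical.

```python
import math

def read_outcome_probabilities(data_avail, parity_avail):
    """Return (normal, degraded, failure) probabilities for one EC stripe.

    data_avail / parity_avail: availability probabilities of the nodes
    holding the k data blocks and the m parity blocks (assumed independent).
    Assumes an MDS code: the stripe is readable if at least k blocks survive.
    """
    k = len(data_avail)
    all_avail = list(data_avail) + list(parity_avail)

    # Normal read: every data block is directly available.
    p_normal = math.prod(data_avail)

    # dist[j] = probability that exactly j of the blocks are available
    # (dynamic programming over the nodes, i.e. a Poisson binomial distribution).
    dist = [1.0]
    for p in all_avail:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this node is unavailable
            nxt[j + 1] += q * p       # this node is available
        dist = nxt

    p_readable = sum(dist[k:])          # at least k blocks survive
    p_failure = 1.0 - p_readable        # fewer than k blocks survive
    p_degraded = p_readable - p_normal  # readable, but needs decoding
    return p_normal, p_degraded, p_failure

# Hypothetical cluster: six nodes with uneven reliability, EC with k=4, m=2.
nodes = [0.99, 0.99, 0.95, 0.95, 0.90, 0.90]
layout_a = read_outcome_probabilities(nodes[:4], nodes[4:])  # data on the most reliable nodes
layout_b = read_outcome_probabilities(nodes[2:], nodes[:2])  # data on the least reliable nodes
print("layout A:", layout_a)
print("layout B:", layout_b)
```

Both layouts use the same six nodes, so under these assumptions the read failure probability is identical; only the split between normal and degraded reads changes, which is the quantity a layout policy can trade off against redundancy.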
Keywords/Search Tags: distributed storage system, redundant storage, erasure coding, data compression, DQN