
Research On Risk-and-popularity-aware Recovery Schemes For Erasure-coded Clustered In-memory Stores

Posted on: 2022-09-18 | Degree: Master | Type: Thesis
Country: China | Candidate: H J Feng | Full Text: PDF
GTID: 2518306572490924 | Subject: Computer system architecture
Abstract/Summary:
To achieve low access latency and high throughput, high-performance computing (HPC) systems often use memory as the storage medium and aggregate the computing power of multiple nodes. However, in-memory stores face three technical challenges. First, in-memory data can become temporarily unavailable, so redundancy schemes must be introduced to provide fault tolerance. Second, at large node scale, node failures tend to occur concurrently, which lowers system reliability; data reconstruction therefore needs to assess the risk of data loss. Third, imbalanced user access can overload a small number of nodes and reduce the overall access performance of the cluster, so both the availability of in-memory data and user access performance must be guaranteed.

To ensure reliability during the reconstruction of erasure-coded in-memory data, a Risk-based Recovery Scheme (Risk-RS) is proposed, which quantifies the risk of data loss in erasure-coded stripes. Stripes are assigned risk levels according to the number of failed blocks they contain, and failed blocks in higher-risk stripes are recovered first, so as to ensure high reliability and durability of in-memory data. To ensure availability during reconstruction, the popularity-based recovery strategy used in RAID systems is applied to memory, yielding the Popularity-based Recovery Scheme (Popularity-RS). To ensure both reliability and availability, two Risk-and-Popularity-aware Hybrid Recovery Schemes (RP-HRS) are proposed that combine Popularity-RS and Risk-RS: the Popularity-first Hybrid Recovery Scheme (Popularity-HRS) and the Risk-first Hybrid Recovery Scheme (Risk-HRS). In addition, a reconstruction time model is established that takes the factors affecting reconstruction time into account, and the influence of each factor on reconstruction time is analyzed quantitatively.

To evaluate Risk-RS, Risk-HRS and Popularity-HRS, Popularity-RS and Sequential Stripe Recovery (Basic) are used as baseline schemes, and all five schemes are run by replaying a trace generated by YCSB. The experimental results show that, compared with Basic, Popularity-HRS, Risk-HRS and Risk-RS reduce reconstruction time by 37.2%, 52.5% and 55.6%, respectively; compared with Popularity-RS, they reduce reconstruction time by 32.4%, 50.1% and 52.3%, respectively. Compared with Basic and Popularity-RS, Popularity-HRS and Risk-HRS both achieve a better balance between access performance and reconstruction performance, and can effectively reduce the number of external-memory reads under appropriate recovery ratios.
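
The five recovery orderings named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The abstract only states that risk is determined by the number of failed blocks in a stripe and that the hybrid schemes prioritize either popularity or risk. The `Stripe` fields, the use of an access count as the popularity measure, and the lexicographic tie-breaking in the two hybrid orderings are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stripe:
    stripe_id: int
    failed_blocks: int  # risk level: more failed blocks means higher risk
    popularity: int     # hypothetical access count, e.g. derived from a trace


def basic_order(stripes):
    # Sequential Stripe Recovery (Basic): recover in stripe-id order.
    return sorted(stripes, key=lambda s: s.stripe_id)


def risk_rs_order(stripes):
    # Risk-RS: stripes with more failed blocks are recovered first.
    return sorted(stripes, key=lambda s: (-s.failed_blocks, s.stripe_id))


def popularity_rs_order(stripes):
    # Popularity-RS: more frequently accessed stripes are recovered first.
    return sorted(stripes, key=lambda s: (-s.popularity, s.stripe_id))


def risk_hrs_order(stripes):
    # Risk-HRS (assumed form): risk first, popularity breaks ties.
    return sorted(
        stripes, key=lambda s: (-s.failed_blocks, -s.popularity, s.stripe_id)
    )


def popularity_hrs_order(stripes):
    # Popularity-HRS (assumed form): popularity first, risk breaks ties.
    return sorted(
        stripes, key=lambda s: (-s.popularity, -s.failed_blocks, s.stripe_id)
    )


def ids(stripes):
    # Helper for inspecting a recovery order.
    return [s.stripe_id for s in stripes]


# Made-up example data, not taken from the thesis.
EXAMPLE = [
    Stripe(0, failed_blocks=1, popularity=50),
    Stripe(1, failed_blocks=2, popularity=10),
    Stripe(2, failed_blocks=2, popularity=30),
    Stripe(3, failed_blocks=2, popularity=50),
]
```

With this example data, `ids(risk_rs_order(EXAMPLE))` places the three stripes with two failed blocks ahead of stripe 0, while `ids(popularity_hrs_order(EXAMPLE))` places the two most-accessed stripes first and puts the riskier of them at the front. The recovery ratio mentioned in the abstract is not modeled here, because the abstract does not define it.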
Keywords/Search Tags:Erasure code, Data reconstruction, Reliability availability, Risk quantification