Font Size: a A A

Optimization Research Of In-Memory Key-Value Store System Based On Erasure Codes

Posted on:2024-08-04Degree:DoctorType:Dissertation
Country:ChinaCandidate:C Q XiaoFull Text:PDF
GTID:1528307316991209Subject:Software engineering
Abstract/Summary:PDF Full Text Request
With the explosive growth of data volumes across various industries,modern business workloads are trending towards increased complexity,hybridization,and real-time processing.This undoubtedly brings unprecedented challenges to distributed data management.Distributed caching has become a core technology to support data centres and data-intensive applications,in which in-memory key-value storage,the main implementation of distributed caching,has become a critical component of many distributed systems.Given the global emphasis on energy conservation and in line with business needs,it is crucial to apply Erasure Coding(EC)strategies with high storage utilization to in-memory key-value store systems(IMKVSSs).EC can effectively improve storage efficiency,reduce data redundancy,and enhance data availability and durability.This not only enables enterprises to strike a balance between storage performance and green energy saving,but also aligns with the need for sustainable development.However,applying EC to IMKVSSs involves a high degree of complexity.Current research tends to be confined to specific application scenarios,where comparability is often lacking.This limitation restricts the universality of strategies,so that the original solutions often cannot be directly applied or need substantial modifications when scenarios change.Furthermore,inherent flaws in traditional EC strategies increase the resource consumption of IMKVSSs.The research in this paper focuses on two main issues,and extracts the following three major challenges: Firstly,how to comprehensively explore the usability of EC strategies in IMKVSSs;Secondly,facing the repair amplification problem existing in tra-ditional EC strategies,there are some innovative strategies in theory,but they are highly complex and require in-depth research and optimization;Finally,the EC strategies currently used in IMKVSSs are based on finite field operations,whose high computational complexity directly affects the speed of the coding engine.Therefore,it is necessary to innovate new coding strategies suitable for memory storage.Surrounding these three challenges,the main work and contributions of this paper are as follows:1.Designed an experimental platform for a distributed In-Memory Key-Value Store Systems based on erasure coding.This paper first analyzes the complexity of deploying EC strategies in IMKVSSs from multiple perspectives,and summarizes the existing research.Based on this,the paper implements a variety of redundancy strategies and the design of different types of nodes in both normal and degraded read states.Moreover,an experimental platform based on Docker container technology for EC redundancy strategies is constructed and implemented,which can instantiate clusters of various scales.Experimental results show that the performance of this experimental platform aligns with the trends of current research.2.A coding strategy based on product bit-matrix is proposed.In response to the repair amplification problem inherent in EC strategies and the impact of high complexity operations in finite fields on its performance,this paper first analyzes the computational complexity of operations within the finite field GF(2~w)and the computational complexity of product-matrix framework(PM)for regenerating codes.Based on this,a product bit-matrix coding framework(PB)is constructed.Concurrently,two heuristic algorithms are proposed to find the PB construction method with the least local XOR count.Experimental results show that the number of XOR operations for the optimized PB-MSR code is reduced by 11.73%,and the number of XOR operations for the optimized PB-MBR code is reduced by20.17%.3.A coding strategy with minimum storage overhead based on non-finite fields is proposed.Due to the fact that the computational complexity of multiplication in finite fields GF(2~w)tends to increase exponentially with the size of the field,this paper designs and implements a memory-oriented Shift-XOR Minimum Storage Regenerating code(Mem SX-MSR)in the non-finite field.The arithmetic operations in the non-finite field enhance the encoding and decoding speed of the Mem SX-MSR code.Compared to the RS code in the Jerasure library,the encoding performance is approximately 6-8 times faster,and the decoding performance is approximately 3-4 times faster.Furthermore,compared to the RS code,the Mem SX-MSR code has lower repair bandwidth overhead.4.A coding strategy with minimum bandwidth overhead based on non-finite fields is proposed.Similarly,to address the challenge of designing novel coding strategies suitable for in-memory storage,a non-finite field bi-directional shiftXOR Minimum Bandwidth Regenerating code(BSX-MBR code)is designed and implemented based on an in-depth understanding of the trade-off between storage overhead and bandwidth overhead.Additionally,the system BSX-MBR code is designed,and the parameter values for achieving optimal storage overhead for the BSX-MBR code are defined from a theoretical perspective.Finally,through comparisons with RS codes both theoretically and experimentally,it is found that although the redundancy storage of system BSX-MBR code increases by 1.27 to1.29 times compared to the RS code from the Jerasure library,its repair overhead only accounts for 20% to 40% of the RS code.
Keywords/Search Tags:Erasure codes, Regenerating codes, In-Memory Key-Value Store Systems, Redundancy Strategies
PDF Full Text Request
Related items