| With the rise of data-driven business, in-memory storage systems are becoming an important building block of data centers. In-memory storage systems provide sub-millisecond latency and improve the concurrency of user applications. Because memory is volatile and deployments are large, data loss is prevalent, and recovering from it is expensive. It is therefore indispensable to implement fault tolerance for in-memory storage systems. Typically, there are two redundancy-based fault-tolerance schemes: replication and erasure coding (EC). Replication offers higher performance, while EC offers higher storage efficiency. To strike a proper balance between time and space overhead, a hybrid fault-tolerance scheme that combines replication with EC can be employed in an in-memory storage system. Moreover, the storage performance and efficiency requirements of some data change over time, so an in-memory storage system must be able to change its fault-tolerance scheme dynamically to adapt to these changing demands. To this end, we implement ElasticMem, a hybrid fault-tolerant system based on Memcached, a popular distributed in-memory storage system. The research work and main contributions of this thesis are summarized in the following three aspects:
(1) Hybrid fault tolerance and redundancy transition. We implement ElasticMem, a distributed in-memory storage system employing a hybrid fault-tolerance scheme that incorporates both replication and EC. ElasticMem offers the flexibility to use replication or EC for each data item to be stored, and can perform redundancy transition to change the redundancy scheme of stored data dynamically as storage requirements change.
(2) EC-oriented replication to optimize redundancy transition. We introduce the Erasure Coding Oriented Replication (EOR) layout in ElasticMem. EOR determines the data placement of replication according to the data layout of EC, which significantly reduces the I/O overhead of redundancy transition and improves its performance. At the same time, EOR still provides as much fault tolerance as replication, with similar access performance.
(3) A lightweight and efficient solution to concurrent consistency problems. For data stored with block-based schemes such as EC, we point out a potential consistency issue under concurrent access and analyze its causes. We implement a lightweight table-based scheme in ElasticMem that detects correlated concurrent read and write requests and schedules them to avoid consistency issues. In addition, we design a data bypass that serves subsequent correlated requests with local data, which saves network overhead and improves access performance.
Our testbed experiments show that ElasticMem reduces the redundancy transition time by up to 35% by leveraging EOR. Additionally, with data bypass, ElasticMem can reduce the latency of a single request to at most 6 μs, and remarkably reduces the overall latency of multiple concurrent requests with dependencies. |
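
The space side of the replication/EC trade-off mentioned above can be made concrete with a little arithmetic. The following is a minimal illustrative sketch, not ElasticMem code: it assumes an MDS erasure code (such as Reed-Solomon) with `k` data chunks and `m` parity chunks, and the function names and example parameters (3 replicas, `k = 4`, `m = 2`) are hypothetical choices rather than values taken from the thesis.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Bytes stored per byte of user data under r-way replication."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return Fraction(replicas, 1)


def ec_overhead(k: int, m: int) -> Fraction:
    """Bytes stored per byte of user data under a (k, m) MDS erasure code."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return Fraction(k + m, k)


def tolerated_failures_replication(replicas: int) -> int:
    """r-way replication survives the loss of any r - 1 copies."""
    return replicas - 1


def tolerated_failures_ec(m: int) -> int:
    """An MDS code with m parity chunks survives the loss of any m chunks."""
    return m


if __name__ == "__main__":
    # Hypothetical parameters chosen so both schemes tolerate two failures.
    print("3-way replication:", replication_overhead(3), "x storage,",
          tolerated_failures_replication(3), "failures tolerated")
    print("EC with k=4, m=2: ", ec_overhead(4, 2), "x storage,",
          tolerated_failures_ec(2), "failures tolerated")
```

Under these assumed parameters both schemes tolerate two lost nodes, but replication stores three bytes per byte of user data while EC stores 1.5. The price of EC is the encoding and reconstruction work on the access path, which is the time overhead the hybrid scheme aims to balance against space.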