Efficient Implementation Techniques for Block-Level Cloud Storage Systems

Posted on: 2015-04-11
Degree: Ph.D
Type: Dissertation
University: State University of New York at Stony Brook
Candidate: Simha, Dilip Nijagal
Full Text: PDF
GTID: 1478390017998411
Subject: Computer Science
Abstract/Summary:
A fundamental building block for an IaaS (Infrastructure-as-a-Service) cloud service such as Amazon's EC2 is a storage virtualization system that provides block-level storage services to individual virtual machines over the network. This dissertation addresses four major problems in such a block-level cloud storage system, in the context of an end-to-end IaaS solution called ITRI Cloud OS. First, to effectively eliminate redundancies in stored data blocks, we propose a scalable block-level deduplication engine called Sungem, which uses both sampling and prefetching to minimize the performance overhead of fingerprint accesses, and features a storage block garbage collection algorithm whose run-time overhead is proportional only to the size of the delta between consecutive backup operations. Second, to efficiently flush metadata updates associated with large-scale block-level storage management, we developed a novel storage system architecture called BOSC (Batching mOdifications with Sequential Commit), which uses largely sequential writes to commit updates to disk and is thus able to sustain high-throughput and low-latency metadata updates that are largely random. Third, as part of the BOSC architecture, we invented a high-throughput low-latency disk logging system called Beluga, which fashions a carefully tuned disk write pipeline and makes it possible to provide, on an array of three commodity 7200 RPM SATA disks, close to 5 million fine-grained (64-byte) disk logging operations per second, which is close to the maximum possible bandwidth on a commodity disk, while keeping the latency of each logging operation under 1 msec. Finally, we devised a set of techniques for supporting software-defined storage service on a distributed and replicated storage architecture. Specifically, we developed a distributed storage QoS guarantee system called Cheetah, which is able to provide a bandwidth guarantee to each virtual disk attached to a virtual machine, while ensuring that the loads on the distributed storage nodes are balanced and that the locality of the access stream associated with each virtual disk is preserved as much as possible.
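
The general idea behind batching random metadata updates and committing them with sequential writes can be illustrated with a small toy. The sketch below is not BOSC's actual design, which the abstract does not detail; the class name `BatchedUpdateStore`, the JSON log format, the `batch_size` threshold, and the in-memory `table` standing in for an on-disk structure are all assumptions made for illustration. Each update is first appended to a sequential log, then held in memory and applied later in key order.

```python
import json
import os
import tempfile


class BatchedUpdateStore:
    """Toy metadata store (hypothetical): log first, apply in sorted batches."""

    def __init__(self, log_path, batch_size=4):
        self.log_path = log_path
        self.batch_size = batch_size
        self.pending = {}       # key -> latest value not yet applied
        self.table = {}         # stands in for the on-disk metadata structure
        self.apply_order = []   # records the order in which keys were applied

    def update(self, key, value):
        # Durability step: append the update to a sequential log file.
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Batching step: coalesce updates to the same key in memory.
        self.pending[key] = value
        if len(self.pending) >= self.batch_size:
            self.commit()

    def commit(self):
        # Apply pending updates in key order, so a disk-backed table
        # would be swept in one direction instead of seeking randomly.
        for key in sorted(self.pending):
            self.table[key] = self.pending[key]
            self.apply_order.append(key)
        self.pending.clear()


def demo():
    log_path = os.path.join(tempfile.mkdtemp(), "updates.log")
    store = BatchedUpdateStore(log_path, batch_size=4)
    for key, value in [(90, "a"), (7, "b"), (42, "c"), (7, "d"), (13, "e")]:
        store.update(key, value)
    return store, log_path
```

In this toy, the five random-order updates produce five sequential log appends, while the table itself is touched only once per distinct key and in ascending key order. A real system would also need log replay on recovery and log truncation after commit, which are omitted here.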
Keywords/Search Tags: Storage, System, Cloud, Virtual, Block-level, Disk