Research on Key Technologies of Host-Side Caching in Cloud Computing

Posted on: 2021-12-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Y Fu
GTID: 1488306548991299
Subject: Computer Science and Technology
Abstract/Summary:
Cloud computing systems are an irreplaceable infrastructure supporting Internet application services and big data processing. As cloud computing systems support various data-intensive scenarios and application consolidation continues to grow, they must meet much higher I/O requirements. To improve service reliability, virtual machines (VMs) in cloud computing systems store their critical data in distributed storage, but the distributed storage system has become an I/O bottleneck. With the rapid development of non-volatile memory technologies, solid state drives (SSDs) have been widely used in cloud computing systems as host-side caches to improve the VMs' I/O performance. However, the unique I/O characteristics of VMs pose three severe challenges for SSD caching: (1) under Copy-on-Write (COW) virtual disks, SSD caching is not efficient; (2) the journaling mechanism used in the VMs' file systems reduces SSD caching efficiency; and (3) the limited SSD cache space needs to be allocated to the VMs appropriately according to their Quality-of-Service (QoS) requirements. This dissertation proposes the following three caching solutions to address these challenges.

Copy-on-Write (COW) virtual disks are commonly used to provide storage for VMs because of their rich features. However, this richness introduces complex management in two ways: the additional management of the metadata in COW virtual disks, and the I/O amplification induced by the COW mechanism. The complex management significantly changes the VMs' I/O pattern, which causes poor metadata performance and COW amplification in SSD caches. In this dissertation, we propose a COW-aware SSD caching system, COWCache, that addresses these problems. First, COWCache designs a new architecture that bridges the semantic gap between SSD caching and virtual disk management and enables cross-layer optimizations. Second, it manages COW metadata separately, with a fine-grained caching and journaling mechanism, to improve metadata caching efficiency. Finally, it proposes a novel decoupled COW mechanism, which removes the amplified I/O requests from the critical I/O path and caches only the data with real VM locality in the SSD. Evaluations show that COWCache improves VM performance by up to 122.7% and reduces SSD cache writes by up to 78.5% compared to traditional SSD caching solutions.

The file system in a VM typically uses journaling to ensure storage consistency. However, journaling introduces duplicated writes for file system modifications: the logging I/Os to the journal area and the in-place updates to the original locations. SSD caching with journaling induces inefficient logging I/O traffic to the distributed storage system and duplicated caching in the SSD, which not only underutilizes the SSD caches but also aggravates their wear-out. In this dissertation, we propose a journaling-aware SSD caching solution named JCache to address these problems. First, JCache designs a virtual journal device that receives the journaling semantics from the VMs and delivers them to the SSD cache manager. Second, to safely eliminate the logging I/O traffic to the distributed storage system, it devises a cache-only logging mechanism, which transparently uses the persistent SSD caches as the journal area. Finally, it proposes a logical caching mechanism, which eliminates the duplicated SSD caching induced by logging I/Os and in-place updates and thereby mitigates the wear-out of the SSD caches. Evaluations show that JCache improves VM performance by up to 11.4× and reduces SSD cache writes by up to 42% compared to traditional SSD caching solutions.

SSD caches are usually shared among multiple concurrently running VMs, and they can easily become oversubscribed by the increasingly large data sets and competing demands of VMs with different QoS requirements. We observe that (1) traditional miss ratio curves (MRCs), which are used to determine the relationship between a VM's cache space and its cache performance, overestimate the VM's actual cache space requirements, and (2) traditional SSD cache space allocation schemes do not adequately consider the individual VMs' QoS requirements, resulting in inefficient SSD cache space utilization. This dissertation presents QCache, a QoS-driven cloud cache management solution, to address these problems. QCache proposes the Reuse Working Set based MRC (R-MRC), which supports cache admission control and models only the data with good temporal locality when determining the relationship between cache allocation and cache performance. Using the R-MRCs to estimate the VMs' effective cache demands, QCache then proposes an online optimizer for cache allocation that minimizes the overall QoS distance to the VMs' hit ratio and I/O latency targets. Evaluations show that QCache reduces the overall distance to the VMs' QoS targets by up to 80.6% and reduces SSD cache writes by up to 43.2%.
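
The observation that a conventional MRC can overestimate a VM's cache demand can be illustrated with a small simulation. The sketch below is not the dissertation's R-MRC construction. It is a minimal illustration under stated assumptions: the cache is plain LRU, the "reuse set" is defined offline as the blocks that occur more than once in the trace, and the names `simulate_lru`, `reused_blocks`, and `miss_ratio_curve` as well as the toy trace are hypothetical.

```python
from collections import Counter, OrderedDict


def simulate_lru(trace, cache_size, admit=lambda block: True):
    """Replay `trace` on an LRU cache holding `cache_size` blocks.

    `admit` is an admission filter: a missed block is inserted only if
    admit(block) is true. Returns (miss_ratio, cache_writes).
    """
    if not trace:
        raise ValueError("trace must not be empty")
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    writes = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # hit: refresh recency
            continue
        misses += 1
        if cache_size == 0 or not admit(block):
            continue  # bypass the cache, no cache write
        if len(cache) >= cache_size:
            cache.popitem(last=False)  # evict the least recently used block
        cache[block] = True
        writes += 1
    return misses / len(trace), writes


def reused_blocks(trace):
    """Assumed reuse set: blocks referenced more than once in the trace."""
    counts = Counter(trace)
    return {block for block, n in counts.items() if n > 1}


def miss_ratio_curve(trace, sizes, admit=lambda block: True):
    """Miss ratio at each cache size, obtained by repeated simulation."""
    return {size: simulate_lru(trace, size, admit)[0] for size in sizes}


if __name__ == "__main__":
    # Toy trace: a hot pair (1, 2) interleaved with one-time scan blocks.
    trace = [1, 2, 100, 101, 1, 2, 102, 103, 1, 2, 104, 105, 1, 2]
    reuse = reused_blocks(trace)
    sizes = range(0, 6)
    print("plain MRC   :", miss_ratio_curve(trace, sizes))
    print("filtered MRC:", miss_ratio_curve(trace, sizes, lambda b: b in reuse))
```

On a trace like this one, where a small hot set is interleaved with blocks that are never reused, the filtered curve reaches its floor at a smaller cache size than the plain LRU curve and the cache receives fewer writes. This is only the qualitative effect the abstract describes; an online system cannot know the reuse set in advance and would have to estimate it.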
Keywords/Search Tags: Cloud Computing, SSD Cache, Virtual Machine, Virtual Disk, Copy on Write, File System Consistency, Miss Ratio Curve, Cache Space Allocation