
Performance Optimizations For Multi-Tenant Cloud Storage System

Posted on: 2023-04-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Chen
GTID: 1528307172951949
Subject: Computer Science and Technology
Abstract/Summary:
With the growth of the Internet in recent years, the amount of data generated by applications has exploded. Cloud storage systems are increasingly popular because of their high scalability, high availability, and simple data management. As applications migrate from stand-alone storage systems to cloud storage systems, the change in storage environment brings new challenges for performance optimization. First, changes in storage system architecture alter the end-to-end storage path of requests. Second, as cloud storage systems become widespread, tenants' access characteristics grow more diverse and continue to evolve. Third, a cloud storage system must serve multiple tenants simultaneously, and these tenants have different requirements and compete for resources. To address these issues, this dissertation studies performance optimizations for multi-tenant cloud storage systems from two aspects, access characteristics and the management of co-existing tenants, while taking the architecture of the cloud storage system into account.

To exploit data correlations among the objects a tenant accesses in the cloud storage system, Cora, a set of data correlation-based policies for optimizing access performance, is proposed. By leveraging access correlations among objects, objects that are likely to be accessed can be prefetched, so that tenants fetch them quickly when subsequent accesses hit the cache. However, existing cloud storage systems have no mechanism for identifying the correlated objects accessed by a tenant, and therefore lose the opportunity to use access correlations to prefetch objects and improve access performance. Moreover, existing correlation-oriented optimization methods rely on the tenant's access history and identify sets of correlated objects by mining repetitive sequences that occur with high frequency. Two cases are thus ignored: repetitive sequences that occur less frequently, and sequences that are similar but not repetitive. To address these issues, Cora explores data correlations among objects accessed by tenants to perform prefetching and improve access performance. Cora identifies collections of objects with data correlations by analyzing the content of the objects. For the correlated objects in a collection, Cora implements mechanisms for correlation maintenance during object storage and for prefetching of correlated objects during object access, to maximize performance gains while minimizing overhead. Experiments with synthetic and real workloads show that Cora significantly improves the access performance of workloads in the cloud storage system. Compared with the existing cloud storage system, workload latency and throughput improve by up to 55.39% and 285.24%, respectively.

To address the poor access performance of data-intensive applications in cloud storage systems, Mass, a set of access characteristic-aware performance optimization policies, is proposed. In cloud storage systems, the data and metadata of objects are stored persistently in the local file system on the storage nodes. Data-intensive applications generate a large amount of metadata, which is complicated to manage. Moreover, most of the data generated by these applications are small objects that are distributed discontinuously in the file system hierarchy, which results in a large number of random accesses and makes the access process of data-intensive applications on storage nodes inefficient. In addition, cloud storage systems serve a wide variety of applications with different performance requirements, so a single common optimization approach does not suit all applications. Mass implements a progressive optimization strategy: it analyzes the access characteristics of applications and uses the relationship between access characteristics and performance requirements to apply appropriate optimization methods on storage nodes, alleviating the performance problems there. Specifically, the application workload is classified as read-intensive, write-intensive, or mixed read-write according to the read-write ratio of application requests, and its performance requirements are judged to focus on latency, throughput, or both, respectively. Different read-write ratio-aware optimization strategies are then implemented at the storage nodes accordingly. In addition, multi-characteristic-aware optimization policies are implemented in real application scenarios by combining object types and other features to further improve performance. Experiments with synthetic and real workloads show that Mass significantly improves the performance of data-intensive workloads in the cloud storage system with both the basic and the comprehensive optimization policies. Compared with the existing cloud storage system, workload latency and throughput improve by up to 52.2% and 213.32%, respectively.

To address the problem that the multi-tenant storage policies of existing cloud storage systems can neither meet tenants' end-to-end performance requirements nor cope with tenants' changing needs, C-Mass, a software-defined cloud storage system based on data plane partitioning, is proposed. Cloud storage systems usually serve different tenants with a single fixed storage configuration, which cannot meet tenants' differing needs. Existing multi-tenant storage policies provide differentiated storage configurations by separating the forwarding paths of tenants; however, their limited configurability cannot meet tenants' end-to-end performance requirements or cope with changes in tenant requirements. To address these issues, C-Mass extends the implementation of multi-tenant policies by separating policy control from the data plane to enforce complete control over the end-to-end path of requests. In the data plane, the entire cloud storage system is logically partitioned into sub-stores, which are assigned to different tenants. The control plane monitors and collects information about tenant requests and cluster nodes, and configures the optimization policies and resources of the different sub-stores accordingly. Performance isolation between tenants is provided by controlling the resource occupancy of the sub-stores and by implementing optimization policies in the sub-stores based on tenant demands. When tenant demands change, the optimization policies and resources of the sub-stores are adjusted in response. The results show that C-Mass significantly improves workload performance and system efficiency, and responds to workload changes in a timely manner. Compared with the existing cloud storage system, workload latency and throughput improve by up to 81.6% and 231.5%, respectively.
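
As a rough illustration of the read-write ratio classification that the abstract attributes to Mass, the sketch below labels a request trace as read-intensive, write-intensive, or mixed read-write and looks up the corresponding performance focus (latency, throughput, or both). It is not the dissertation's implementation: the names `WorkloadClass`, `PERFORMANCE_FOCUS`, and `classify_workload` are hypothetical, and the 0.7 and 0.3 cut-offs are assumed values, since the abstract does not state the thresholds actually used.

```python
from collections import Counter
from enum import Enum


class WorkloadClass(Enum):
    READ_INTENSIVE = "read-intensive"
    WRITE_INTENSIVE = "write-intensive"
    MIXED = "mixed read-write"


# Performance focus per class, following the "respectively" mapping in the abstract.
PERFORMANCE_FOCUS = {
    WorkloadClass.READ_INTENSIVE: ("latency",),
    WorkloadClass.WRITE_INTENSIVE: ("throughput",),
    WorkloadClass.MIXED: ("latency", "throughput"),
}

# Assumed cut-offs on the fraction of reads; the abstract gives no thresholds.
READ_THRESHOLD = 0.7
WRITE_THRESHOLD = 0.3


def classify_workload(ops):
    """Classify a sequence of "read"/"write" operations by its read fraction."""
    counts = Counter(ops)
    total = counts["read"] + counts["write"]
    if total == 0:
        raise ValueError("no read or write requests observed")
    read_fraction = counts["read"] / total
    if read_fraction >= READ_THRESHOLD:
        return WorkloadClass.READ_INTENSIVE
    if read_fraction <= WRITE_THRESHOLD:
        return WorkloadClass.WRITE_INTENSIVE
    return WorkloadClass.MIXED


if __name__ == "__main__":
    trace = ["read"] * 8 + ["write"] * 2  # 80% reads
    workload = classify_workload(trace)
    print(workload.value, PERFORMANCE_FOCUS[workload])
```

In a real deployment the ratio would presumably be measured over a sliding window of tenant requests so that the class can change as the workload evolves; that windowing is omitted here for brevity.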
Keywords/Search Tags: Cloud Storage, Software-Defined Storage, Storage Service, Multi-Tenancy, Performance Optimizations