Research And Implementation Of Distributed Cache System For The Cloud Storage Gateway

Posted on: 2013-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: T Shen
Full Text: PDF
GTID: 2268330422973797
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology, the amount of data generated by enterprises and individuals is growing rapidly. Traditional massive data storage systems scale poorly: they scale up through equipment upgrades, which increases the cost of system management and operation. Cloud storage systems based on distributed file systems offer advantages in storage capacity, scalability and reliability, and are increasingly widely used for massive data storage. However, because mainstream cloud storage systems do not share a unified application interface, existing applications built on top of different systems cannot work directly with them, and it is difficult to migrate these applications rapidly to the new platforms. In addition, the data security of cloud storage is a core concern for users.

In order to migrate existing applications rapidly to a cloud storage platform and to guarantee data security, our research group has designed a cloud storage gateway named JoinIn. JoinIn abstracts the interface of the backend cloud storage system into a POSIX-compatible traditional file system interface. The metadata server of JoinIn is located in the LAN and can be accessed only under specified access control rules, while the data is stored in the backend cloud storage system across the WAN.

To address the high latency and low throughput introduced by this WAN-based cloud storage architecture, this thesis designs and implements a distributed data cache system for the cloud storage gateway JoinIn. The main design philosophy of the JoinIn cache system is to exploit the locality of accesses. The cache system keeps frequently referenced items in a cache close to the users. When the data is accessed again, it can be fetched quickly from the cache, avoiding interaction with the backend cloud storage system. Thus the data transmission delay is reduced, the load on the backend servers is alleviated and bandwidth is saved.

The main contributions of this thesis include:

1) We propose an architecture for the cache system of the cloud storage gateway JoinIn. Because a memory cache is limited in capacity and volatile, we propose a novel two-level cache architecture consisting of memory and disk, which increases cache capacity and provides persistent storage of cache content.

2) We propose a new replacement algorithm for the JoinIn cache system, called JoinIn_LRU. Because the classical LRU algorithm does not consider access frequency, the new algorithm considers both recency and frequency.

3) We design and implement a cache cluster architecture based on consistent hashing with virtual nodes. Considering the scalability limits of a single-node cache system, we design and implement the distributed cache cluster architecture on the basis of an in-depth study of the consistent hashing algorithm.

We build a test environment and comprehensively evaluate the function and performance of the proposed system. Experimental results show that the read speed of cloud storage is greatly improved with the cache system. In summary, the cache system proposed in this thesis can serve as an effective approach to improving the QoE of cloud storage systems.
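
The read path implied by a two-level (memory plus disk) cache in front of a slow backend can be illustrated with a minimal Python sketch. This is not the JoinIn implementation: the class name `TwoLevelCache`, the `fetch_from_backend` callable, plain LRU ordering in the memory tier, and the policy of demoting memory evictions to disk are all assumptions made for illustration, and the disk tier is left unbounded for brevity.

```python
import hashlib
import os
import tempfile
from collections import OrderedDict


class TwoLevelCache:
    """Hypothetical sketch: memory tier, then disk tier, then the backend."""

    def __init__(self, fetch_from_backend, memory_capacity, disk_dir):
        self.fetch_from_backend = fetch_from_backend  # callable: key -> bytes
        self.memory_capacity = memory_capacity        # max items kept in memory
        self.disk_dir = disk_dir                      # directory for the disk tier
        self.memory = OrderedDict()                   # least recently used first

    def _disk_path(self, key):
        # Hash the key so that any string maps to a safe file name.
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.disk_dir, name)

    def _put_in_memory(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        while len(self.memory) > self.memory_capacity:
            old_key, old_value = self.memory.popitem(last=False)
            # Assumed policy: items evicted from memory are demoted to disk.
            with open(self._disk_path(old_key), "wb") as f:
                f.write(old_value)

    def get(self, key):
        """Return (value, tier), where tier names the level that served the read."""
        if key in self.memory:
            self.memory.move_to_end(key)  # refresh recency
            return self.memory[key], "memory"
        path = self._disk_path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                value = f.read()
            self._put_in_memory(key, value)  # promote back to memory
            return value, "disk"
        value = self.fetch_from_backend(key)  # slow path across the WAN
        self._put_in_memory(key, value)
        return value, "backend"
```

With a memory capacity of 2, reading keys `a`, `b`, `c` and then `a` again serves the last read from the disk tier without a second backend fetch, which is the saving in latency and bandwidth that the abstract describes.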
Keywords/Search Tags: Cloud Storage, Cloud Storage Gateway, Cache, Replacement Algorithm, Consistent Hashing
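
Consistent hashing with virtual nodes, named in the abstract and keywords, can be sketched generically as follows. The thesis's actual cluster design is not given here, so the class `ConsistentHashRing`, the use of SHA-256, the `server#i` naming of virtual nodes and the default of 100 virtual nodes per server are illustrative assumptions rather than the author's choices.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hypothetical sketch of a hash ring with virtual nodes."""

    def __init__(self, virtual_nodes_per_server=100):
        self.virtual_nodes_per_server = virtual_nodes_per_server
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> physical server name

    @staticmethod
    def _hash(text):
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_server(self, server):
        # Each physical server is placed on the ring many times.
        for i in range(self.virtual_nodes_per_server):
            point = self._hash(f"{server}#{i}")
            if point in self._owner:
                continue  # position already taken; extremely unlikely
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove_server(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        """Return the server owning the first ring position after the key."""
        if not self._points:
            raise LookupError("ring has no servers")
        idx = bisect.bisect_right(self._points, self._hash(key))
        if idx == len(self._points):
            idx = 0  # wrap around the ring
        return self._owner[self._points[idx]]
```

The property that makes this attractive for a cache cluster is that adding a server only moves keys onto the new server; every other key stays with its existing owner, so most cached data remains valid when the cluster grows.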