A Scalable Distributed Coordination Service On Cloud Platform

Posted on: 2014-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: H H Lin
GTID: 2268330422462228
Subject: Computer technology
Abstract/Summary:
With the development of cloud computing, cloud services and data analysis are commonly built on platforms of thousands or tens of thousands of machines. In such large-scale environments, software and hardware errors, e.g. node failures, disk crashes and network problems, occur more frequently than before. It is essential to address this problem and guarantee fault tolerance and high availability of services. A coordination service is a critical means of achieving fault tolerance and high availability on a cloud platform. Several coordination service systems exist, e.g. Google Chubby and Yahoo! ZooKeeper, which use a leader-centric approach to synchronize transactions for consensus. With leader-centric data replication, write performance degrades as the number of replicas grows, which is inadequate for current cloud platforms. Thus, to achieve a highly scalable, high-performance coordination service on a cloud platform, the design must improve scalability and update performance while guaranteeing consistency and enabling fast recovery from node failure.

Giraffe is a scalable coordination service on a cloud platform. It has three main parts: a coordination service block, dynamic membership management and a consensus protocol. The coordination service block uses in-memory data organization and an asynchronous watch mechanism for data modifications, serving both distributed algorithms and the coordination service. Dynamic membership management uses interior-node-disjoint trees to organize the coordination service, achieving load balance and reliable distribution. The consensus protocol implements a novel Paxos variant that runs Paxos instances on a per-group basis and achieves high performance for update transactions. Together, these three parts implement a scalable coordination service for large-scale cloud environments.

Giraffe is evaluated on a high-performance computing test-bed located at Huazhong University of Science and Technology. The experimental results show that Giraffe has better update performance than ZooKeeper as the ensemble scales. With an ensemble size of 50, Giraffe's update operations are 3X faster than ZooKeeper's. Moreover, Giraffe reacts to and recovers from node failures more quickly than ZooKeeper.
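
To make the idea of an asynchronous watch mechanism over in-memory data more concrete, the following is a minimal Python sketch. It is not Giraffe's implementation or API: the class `WatchedStore`, its methods, the one-shot watch semantics and the explicit `deliver()` step are all assumptions chosen for illustration. The point it shows is that a write only queues notifications for registered watchers, and delivery happens later, decoupled from the update itself.

```python
from collections import defaultdict, deque


class WatchedStore:
    """Toy in-memory key-value store with one-shot watches (hypothetical API)."""

    def __init__(self):
        self._data = {}                     # path -> value, held in memory
        self._watches = defaultdict(list)   # path -> callbacks awaiting a change
        self._pending = deque()             # notifications queued but not delivered

    def get(self, path, watch=None):
        """Read a value; optionally register a one-shot watch on the path."""
        if watch is not None:
            self._watches[path].append(watch)
        return self._data.get(path)

    def set(self, path, value):
        """Write a value and queue notifications instead of calling back inline."""
        self._data[path] = value
        # Watches are one-shot in this sketch: they are removed once triggered.
        for callback in self._watches.pop(path, []):
            self._pending.append((callback, path, value))

    def deliver(self):
        """Deliver all queued notifications; return how many were delivered."""
        delivered = 0
        while self._pending:
            callback, path, value = self._pending.popleft()
            callback(path, value)
            delivered += 1
        return delivered
```

In this sketch a client would call `get(path, watch=callback)` to read a value and subscribe to its next change, and a separate step (here `deliver()`, in a real system a background thread or event loop) would push the notification after the write completes.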
Keywords/Search Tags: Cloud computing, Coordination service, Scalability, Fault-tolerant