Dynamic Data Balance In Scalable Storage In Cloud

Posted on: 2013-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
Full Text: PDF
GTID: 2218330362459405
Subject: Computer application technology
Abstract/Summary:
In recent years, cloud computing has been a hot research area in the field of distributed computing and in the Internet industry. Since the concept of cloud computing was put forward in 2007, IT companies have developed many cloud computing products and regard it as the next-generation technology. Because of its ultra-large scale, high scalability, and on-demand resource provisioning, cloud computing brings a new vision for application system development.

In the context of cloud computing, this thesis studies the problem of dynamic data balance in data storage in a virtualized environment. Specifically, 'dynamic data balance' means dynamically balancing data in a mass data storage system based on the workload of the distributed server nodes, so that the storage system can dynamically expand or shrink and ultimately maximize resource utilization. Cloud computing platform providers supply consumers with resources (including CPU, memory, hard disk, and network bandwidth) on demand. Thus, to achieve maximum utilization of the allocated resources, the system administrator or the system itself can dynamically control the data distribution according to the actual workload in the system. Compared with stateless HTTP servers, resource management in data storage is much more challenging, because data partitioning and data distribution must be maintained in the system. Furthermore, because traditional relational databases are limited in scalability and elasticity, most data storage systems deployed on cloud computing platforms are distributed data stores. Among them, the P2P data storage system is considered the most suitable for cloud computing platforms because of its full decentralization and high fault tolerance.
Therefore, to maximize resource utilization in the storage system, this thesis proposes a new data balance framework and algorithms based on real-time monitoring of resource utilization in virtual machines and on related techniques from P2P data stores.

Accordingly, this thesis proposes the ALARM data balance model, which makes decisions on data distribution according to predefined policies based on the resource utilization of the system. The ALARM data balance model has four important operations: target, merge, split, and move. The target operation decides whether the local machine is overloaded or underloaded according to the performance of the system. If the local machine is considered underloaded, the merge operation is triggered to merge another machine with this one and shut down one of the two machines. If the local machine is considered overloaded, the split operation is triggered to start a new machine that takes over some of the workload from the overloaded one. Lastly, the move operation moves the data involved in the merge or split operation.

Finally, to validate the correctness and effectiveness of the ALARM model, an ALARM-based data balance module is designed, implemented, and applied to Open Chord, a real P2P data storage system. Query requests to this prototype system are simulated according to the user access logs of the 1998 World Cup website. In the experiments, the ALARM model helps the Open Chord system scale up or down dynamically and autonomously, increasing or decreasing the number of server nodes as the workload in the system changes. The experimental results show that the ALARM model achieves the expected effects in a real storage system.
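
To make the interplay of the four operations concrete, the following is a minimal Python sketch, not the thesis's implementation. The abstract does not give the policy values or data structures, so the thresholds `OVERLOAD` and `UNDERLOAD`, the `Node` class, the `balance` loop, the choice of the least-loaded node as merge partner, and the assumption that load splits evenly with the data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical policy thresholds; the abstract does not state actual values.
OVERLOAD = 0.80
UNDERLOAD = 0.20


@dataclass(eq=False)  # eq=False: nodes are compared by identity, not by content
class Node:
    name: str
    keys: dict = field(default_factory=dict)  # key -> value held by this node
    utilization: float = 0.0                  # observed utilization in [0, 1]


def target(node):
    """Target operation: classify the local node from its utilization."""
    if node.utilization > OVERLOAD:
        return "overloaded"
    if node.utilization < UNDERLOAD:
        return "underloaded"
    return "normal"


def move(src, dst, keys):
    """Move operation: transfer the given keys from src to dst."""
    for k in list(keys):
        dst.keys[k] = src.keys.pop(k)


def split(cluster, node):
    """Split operation: start a new node and hand it the upper half of the keys."""
    new = Node(name=f"{node.name}-split")
    upper_half = sorted(node.keys)[len(node.keys) // 2:]
    move(node, new, upper_half)
    # Illustrative assumption: load divides evenly between the two nodes.
    node.utilization /= 2
    new.utilization = node.utilization
    cluster.append(new)
    return new


def merge(cluster, node, partner):
    """Merge operation: fold node into partner, then shut node down."""
    move(node, partner, list(node.keys))
    partner.utilization += node.utilization
    cluster.remove(node)


def balance(cluster):
    """One balancing pass over a snapshot of the cluster."""
    for node in list(cluster):
        if node not in cluster:  # already shut down earlier in this pass
            continue
        state = target(node)
        if state == "overloaded":
            split(cluster, node)
        elif state == "underloaded" and len(cluster) > 1:
            # Hypothetical partner choice: the least-loaded other node.
            partner = min((n for n in cluster if n is not node),
                          key=lambda n: n.utilization)
            # Only merge if the result would not itself be overloaded.
            if partner.utilization + node.utilization <= OVERLOAD:
                merge(cluster, node, partner)
```

In this sketch a pass over a cluster with one hot node and one idle node first splits the hot node and then folds the idle node into a remaining one, so the node count follows the load while every key stays stored on exactly one node. A real Chord-based store would instead pick merge partners and key ranges from the ring topology.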
Keywords/Search Tags: cloud computing, data stores, data balance, resource management