
Research On Load-balanced Strategy In Cloud Computing

Posted on: 2017-03-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Liu
GTID: 1108330482494774
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of Internet technology, cloud computing has come into being as a new business model. It combines parallel computing, grid computing, virtualization, distributed computing, network storage, and load-balancing technology. With the emergence of cloud computing, work that originally ran on clients can be executed in the cloud, and users only need to buy the services they need. As a result, cloud data centers take on complex and heavy workloads.

Cloud data centers contain tens of thousands of heterogeneous servers and network devices, and users’ demands are diverse, complex, and real-time. This inevitably leads to load imbalance in the data centers, so solving the load-balancing problem of cloud data centers is very important.

At present, with the rapid growth in the number of Internet users, massive amounts of data need to be stored, and this requirement obviously cannot be met by upgrading server hardware alone. A better solution is to store the data in the cloud, which is well suited to massive data storage. However, the nodes in the cloud become unbalanced because of their uneven distribution, their different configurations, differences in resource access popularity, and other factors. This imbalance degrades system performance and, in more serious cases, causes nodes to shut down. The load-balancing problem of data storage in cloud computing must therefore be solved.

One of the most important issues in cloud computing is resource scheduling in cloud data centers. Because of the diversity of the nodes in the cloud and the heterogeneity of users’ needs, the load on the nodes is unbalanced: some nodes are heavily loaded and very busy, while others are lightly loaded and nearly idle. This affects the performance and resource utilization of the whole system.
The load-balancing problem of cloud resource scheduling must therefore be solved as well. A good load-balancing strategy not only avoids uneven distribution of network load, traffic congestion, long response times, and other bottlenecks, but also improves application efficiency.

Based on an analysis of the need for load balancing in cloud computing, this thesis designs four strategies: an optimized load-balancing strategy for cloud storage on Hadoop, a cloud storage load-balancing model based on multi-objective optimization, a resource scheduling load-balancing strategy based on virtual machine migration, and a load-balancing strategy based on dynamic replica management.

The contributions of this thesis are as follows:

1. In the HDFS file storage system, data is divided into blocks, and each block has several replicas to ensure redundancy. As blocks are continually stored and deleted, the storage load on the nodes becomes unbalanced. To solve this problem, data is migrated from heavily loaded nodes to lightly loaded nodes. In the original process, balancing is performed first within the racks that carry the heavy load and only then among the other racks, so the balancing of the heavy racks is delayed. This thesis proposes an optimized strategy with two parts. The first optimization handles heavily loaded racks preferentially: a load threshold is defined, a rack is considered heavy when its load value exceeds the threshold, and heavy racks are balanced first by migrating their data to light racks. The second optimization builds two queues: the queue of heavily loaded nodes is sorted in ascending order, and the queue of lightly loaded nodes is sorted in descending order.
Nodes are then taken from the two queues in sequence, one from each, and data is migrated from the heavy node to the light node.

2. Many algorithms for balancing cloud data storage consider only storage space when migrating data from heavy nodes to light nodes. However, nodes differ in performance, and a node with a stronger configuration has greater processing capacity. The bandwidth between nodes in the network also differs, and a node with higher network bandwidth can respond to more user requests. File access popularity differs as well, and a node holding frequently accessed files carries a heavier load. This thesis presents a load-balancing model based on multi-objective optimization. The algorithm evaluates the load value according to file size, file access frequency, node CPU utilization, node memory size, bandwidth, and other factors, and migration is performed based on these comprehensive load values.

3. To avoid data invalidation, data in a cloud storage system usually has several replicas to ensure reliability. The replicas are placed on different nodes in different racks. When a file is accessed frequently, the load on the nodes holding it increases, and replicas can be added on other nodes to share that load. This thesis presents a replica management strategy based on file access frequency to address load balancing. The strategy determines the number and location of replicas according to file access frequency, node storage space, network bandwidth, replica consistency maintenance cost, and other factors. A replica is deleted when the file’s access frequency is low or the file has not been accessed for a long time.

4. The nodes in the cloud cooperate to complete work and respond to users.
These nodes are very numerous, geographically dispersed, and highly heterogeneous, and users’ applications are diverse, complex, and real-time. Some nodes in the cloud are busy while others are idle, so a reasonable scheduling method is needed to handle the imbalance. This thesis presents a load-balancing strategy for resource scheduling based on virtual machine migration, in which resources are rescheduled from heavily loaded nodes to lightly loaded nodes. The strategy consists of a gathering module, a monitoring module, a predicting module, a selecting module, and a migrating module. The gathering module collects the load values of each node, mainly the CPU, memory, and bandwidth utilization rates. Collection is performed either by the central node polling the nodes at intervals or by each node actively sending its values to the central node. The monitoring module identifies high-load and low-load nodes, using a high threshold, an adaptive threshold, and a low threshold as the judgment conditions. The predicting module and the monitoring module together decide whether to trigger a migration. To avoid unnecessary migrations caused by momentary peaks, this thesis uses exponential smoothing as the prediction method. The selecting module chooses the source machines and the target machines, using a selection strategy based on information entropy to determine the weights objectively.
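
The two-queue step in contribution 1 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name plan_migrations, the single node-level threshold, and the load values are assumptions made for the example. The sort directions follow the abstract (heavy queue ascending, light queue descending).

```python
def plan_migrations(node_loads, threshold):
    """Pair heavy nodes with light nodes for data migration.

    node_loads: dict mapping a node name to its load value (hypothetical scale).
    threshold:  assumed cut-off; nodes above it are treated as heavy.
    """
    # Heavy queue sorted in ascending order of load, as stated in the abstract.
    heavy = sorted(
        (n for n in node_loads if node_loads[n] > threshold),
        key=node_loads.get,
    )
    # Light queue sorted in descending order of load, as stated in the abstract.
    light = sorted(
        (n for n in node_loads if node_loads[n] <= threshold),
        key=node_loads.get,
        reverse=True,
    )
    # Take one node from each queue in sequence; zip stops at the shorter
    # queue, so surplus nodes stay unpaired in this sketch.
    return list(zip(heavy, light))


pairs = plan_migrations({"a": 0.9, "b": 0.7, "c": 0.2, "d": 0.4}, threshold=0.6)
print(pairs)
```

Each returned pair is (source node, target node). How much data to move per pair is not specified in the abstract and is left out here.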
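
The interaction of the predicting and monitoring modules in contribution 4 can be sketched with simple exponential smoothing. This is only a sketch under stated assumptions: the smoothing factor alpha and the high and low threshold values are illustrative, the function names are hypothetical, and the adaptive threshold is omitted because the abstract does not define it.

```python
def exponential_smoothing(samples, alpha=0.5):
    """Return the smoothed load after processing all samples.

    Uses simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1},
    initialised with the first sample.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    forecast = samples[0]
    for value in samples[1:]:
        forecast = alpha * value + (1 - alpha) * forecast
    return forecast


def classify_node(samples, high=0.8, low=0.2, alpha=0.5):
    """Classify a node as 'high', 'low', or 'normal' from its smoothed load.

    The thresholds are assumed values, not taken from the thesis.
    """
    predicted = exponential_smoothing(samples, alpha)
    if predicted > high:
        return "high"
    if predicted < low:
        return "low"
    return "normal"


# A single spike to 0.95 is damped to 0.625, so no migration is triggered.
print(classify_node([0.3, 0.3, 0.3, 0.95]))
# A sustained load of 0.9 stays above the high threshold.
print(classify_node([0.9, 0.9, 0.9, 0.9]))
```

Comparing the smoothed value rather than the latest raw sample against the thresholds is what keeps a momentary peak from triggering a migration.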
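
The abstract states that the selecting module uses information entropy to determine weights objectively but does not give the formulation. The sketch below uses the standard entropy weight method as an assumption; the thesis may differ. The function name entropy_weights and the sample matrix are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Compute indicator weights with the standard entropy weight method.

    matrix: list of rows, one per node; each row holds non-negative indicator
            values (for example CPU, memory, and bandwidth utilization).
    Returns one weight per indicator; the weights sum to 1.
    """
    n = len(matrix)
    if n < 2:
        raise ValueError("at least two nodes are required")
    m = len(matrix[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total <= 0:
            raise ValueError("each indicator column must have a positive sum")
        # Shannon entropy of the column's share distribution, scaled to [0, 1].
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(n)
        # An indicator that varies more across nodes has lower entropy
        # and therefore receives a larger weight.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread <= 1e-12:
        # All indicators are uniform across nodes: fall back to equal weights.
        return [1.0 / m] * m
    return [d / spread for d in divergences]


weights = entropy_weights([[0.5, 0.9], [0.5, 0.1]])
print(weights)
```

In this example the first indicator is identical on both nodes and carries no discriminating information, so the second indicator receives the weight. A weighted sum of a node's indicators with these weights would give one possible composite load score for ranking source and target machines.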
Keywords/Search Tags: Cloud Computing, Load Balancing, Cloud Storage, Resource Scheduling, Hadoop, Virtual Machine Migration, Replica