Research And Implementation Of Integrated Resource Allocation And Scheduling In Cloud Computer Center

Posted on: 2016-10-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C G Wang
GTID: 1108330509461070
Subject: Computer Science and Technology
Abstract/Summary:
Supercomputer systems have traditionally been used for very large-scale scientific computing applications. In the context of big data and cloud computing, they also play an important role in providing computing, storage, communication, and network resources for large-scale data processing and cloud services. The main challenges for current IaaS platforms lie in resource scheduling granularity, multidimensional resource requirements, and unreliable network performance. This thesis stems from the research and development of the cloud platform at the National Supercomputing Center in Guangzhou. It studies integrated scheduling of multiple physical resources for the heterogeneous, high-speed-interconnect, tightly coupled architecture of the TH cloud. The goal is an efficient IaaS cloud that offers resources at a finer granularity than general clouds, which offer resources as virtual machines.

The main contributions of this thesis are:

1. A layered resource vector description method based on a resource graph. A resource graph consists of resource nodes and node links, and each resource node is described in layers as a resource vector. The elements of a vector include computing capability, storage capacity, network bandwidth, and so on. Each element can in turn be expressed as a vector. Computing capability, for example, can be expressed as a vector of CPU, GPU, MIC, and so on, and the GPU can be further detailed as core count, clock frequency, shared cache size, and so on. A communication edge is described as a vector of communication relationship, link capability, switch device, and so on. This description method adapts to resource requirements of different granularity and supports integrated scheduling of node and link resources.

2. A single-node resource vector scheduling algorithm based on sorted vector decomposition, named RVS. To find a single physical node that meets a multidimensional requirement, the single-node requirement vector is first decomposed into multiple single-dimensional vectors. All physical nodes are then sorted by their available capability in each dimension using quicksort. Finally, a suitable physical node is obtained through an intersection operation. The RVS algorithm reduces the time complexity of searching N nodes for a physical node that meets an R-dimensional requirement from O(R * N) to O(R log N).

3. A multi-node resource graph scheduling algorithm based on particle swarm optimization, named VCE-PSO. When multiple nodes are needed under communication capability constraints, the RVS algorithm is called to generate the available physical nodes. The nodes that form a shortest path are then searched for. Each particle is expressed as a subgraph of the physical resource graph that satisfies the required resource graph, and the state of a particle is computed from the total path length over all its nodes and its link bandwidth consumption. The path length of each particle serves as the fitness function, and the path with the least bandwidth consumption is selected as the final solution. Evaluation results show that VCE-PSO can optimize link deployment in a short time using multiple parallel particles.

4. A network link resource scheduling technique based on vector similarity. The multidimensional resource utilization of the switch is monitored, and the resource requirement vector of each network flow is calculated. The cosine similarity between these two vectors is then computed, and the flow with the higher similarity is chosen for processing. If resource utilization exceeds the threshold, all flows are labelled according to their similarities, and feedback is sent to the senders of those flows through Ack messages. The senders then adjust their sending rates. In an evaluation on the cloud platform, this technique improves Open vSwitch throughput by 15%.

5. A data-skew-aware resource scheduling algorithm, named LBS-SA. The performance of large-scale data processing is constrained not only by computing capability but also by I/O bandwidth, network bandwidth and delay, and input data skew. To address this problem, LBS-SA first estimates the computing, I/O, and network resource requirements from the number and size of data blocks on each node. Data blocks that balance the data skew are then chosen from among their replicas. Replica selection also takes node and network resource requirements into account and is subject to a simultaneous-completion constraint. Resources are then allocated according to the balanced computing weights. For a dataset on the scale of 100 GB, this technique reduces overall completion time in Hadoop by 12%.

These techniques have been implemented in Nova, the compute component of OpenStack, and in Open vSwitch, a software switch supporting OpenFlow, and have been applied to the TH cloud at the National Supercomputing Center in Guangzhou. Results show that they markedly improve the utilization of all kinds of resources, reduce resource request time, and optimize big data processing performance on the cloud platform.
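The decompose, sort, and intersect steps described for RVS in contribution 2 can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names `build_index` and `find_nodes`, the node names, and the dimension names are all hypothetical. It also uses Python's built-in sort instead of quicksort, and its set intersection is linear in the number of candidates, so it shows the idea without reproducing the stated complexity bound.

```python
from bisect import bisect_left

def build_index(nodes):
    """Sort the nodes once per resource dimension (hypothetical helper).

    `nodes` maps a node name to a dict of available capacity per dimension.
    Returns, per dimension, parallel lists of capacities and node names
    in ascending order of capacity.
    """
    dims = {d for caps in nodes.values() for d in caps}
    index = {}
    for d in dims:
        pairs = sorted((caps.get(d, 0), name) for name, caps in nodes.items())
        index[d] = ([c for c, _ in pairs], [n for _, n in pairs])
    return index

def find_nodes(index, requirement):
    """Return the set of nodes satisfying every dimension of `requirement`."""
    result = None
    for d, need in requirement.items():
        if d not in index:
            return set()                     # no node offers this dimension
        caps, names = index[d]
        start = bisect_left(caps, need)      # first node with capacity >= need
        candidates = set(names[start:])
        result = candidates if result is None else result & candidates
        if not result:
            return set()                     # some dimension cannot be met
    return result or set()

# Illustrative data only; none of these values come from the thesis.
nodes = {
    "n1": {"cpu": 8, "mem_gb": 32, "net_gbps": 10},
    "n2": {"cpu": 16, "mem_gb": 64, "net_gbps": 10},
    "n3": {"cpu": 32, "mem_gb": 16, "net_gbps": 40},
}
index = build_index(nodes)
print(sorted(find_nodes(index, {"cpu": 16, "mem_gb": 32, "net_gbps": 10})))
# → ['n2']
```

Sorting happens once when the index is built, so each request only pays for one binary search per dimension plus the intersection of the candidate sets.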
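The cosine-similarity comparison in contribution 4 can be sketched the same way. The sketch below only ranks flows by the similarity between a switch utilization vector and each flow's requirement vector, as the abstract describes. The vector dimensions, the flow names, and the function `rank_flows` are assumptions made for illustration, and the threshold check and Ack-based feedback are not modelled.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_flows(switch_vector, flows):
    """Order flow names from most to least similar to the switch vector.

    `flows` maps a flow name to its resource requirement vector
    (hypothetical representation, not taken from the thesis).
    """
    return sorted(
        flows,
        key=lambda name: cosine_similarity(switch_vector, flows[name]),
        reverse=True,
    )

# Illustrative values only: assumed dimensions are (cpu, buffer, bandwidth).
switch_utilization = [0.9, 0.2, 0.4]
flows = {
    "flow_a": [0.1, 0.1, 0.8],
    "flow_b": [0.8, 0.1, 0.3],
}
ranking = rank_flows(switch_utilization, flows)
```

Cosine similarity compares the direction of the two vectors and ignores their magnitude, so a small flow and a large flow with the same resource mix receive the same score.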
Keywords/Search Tags: Cloud Computing, Supercomputer, Resource Allocation, Link Resource Allocation, Parallel Resource Allocation, Data Skew Scheduling, Congestion Control