
EVC: Elastic Virtual Cluster Deployment And Management

Posted on: 2012-04-01 Degree: Master Type: Thesis
Country: China Candidate: H B Wang Full Text: PDF
GTID: 2178330332999970 Subject: Network and information security
Abstract/Summary:
In this paper, EVC (Elastic Virtual Cluster) is designed as a convenient, basic tool for instantiating virtual clusters on physical resources on demand. The virtual clusters can then be used to run parallel jobs, which solves the problems of resource co-allocation and incompatible software execution environments. EVC introduces virtualization technology into grids and creates virtual clusters through grid jobs. A set of APIs is exposed so that resource consumers can manage the virtual clusters. Our ultimate goal is to make EVC a lightweight grid virtualization middleware on top of which resource consumers can interact with virtualized resources. In our work, EVC has been integrated into CSF4, which acts as the front-end server: it receives users' job requests and passes them to EVC, which instantiates a dedicated virtual cluster with the configured execution environment.

EVC borrows ideas from VJM (Virtual Job Model) and improves on it. VJM was designed by the author's team to solve the resource co-allocation problem in grids. It models the resource co-reservation process, and its main idea is two-stage job submission. When a parallel job arrives and is passed to VJM, VJM first generates a collection of virtual jobs that hold the same resource requirements as the real parallel job, and then dispatches these virtual jobs to an appropriate set of physical clusters. Each virtual job lines up in the local resource manager's job queue. Once a virtual job is scheduled, it starts a personal gatekeeper process and reports the resource manager contact string back to VJM. After all the virtual jobs for one parallel job have registered, VJM enters its second stage and dispatches the real job to these personal gatekeepers through the DUROC protocol. This two-stage job submission separates resource co-allocation from job execution and provides more flexibility; for example, a separate optimal strategy can be designed for each stage.
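The two-stage submission described above can be illustrated with a minimal Python sketch. It is not the VJM implementation: the class `TwoStageSubmission`, its method names, and the contact strings are hypothetical, and queuing, gatekeeper start-up, and DUROC dispatch are reduced to plain method calls. It only shows the control flow in which the real job is released once every virtual job has reported back.

```python
from dataclasses import dataclass, field


@dataclass
class TwoStageSubmission:
    """Hypothetical model of two-stage submission for one parallel job."""
    clusters: list                                      # target physical clusters
    contacts: dict = field(default_factory=dict)        # cluster -> contact string
    dispatched_to: list = field(default_factory=list)   # filled in by stage two

    def stage_one(self):
        # Stage one: one placeholder (virtual) job per target cluster.
        # In a real system these would wait in each local job queue.
        return [f"virtual-job@{c}" for c in self.clusters]

    def register(self, cluster, contact):
        # Called when a scheduled virtual job reports its contact string.
        if cluster not in self.clusters:
            raise ValueError(f"unknown cluster: {cluster}")
        self.contacts[cluster] = contact
        # Stage two starts only after every virtual job has registered.
        if len(self.contacts) == len(self.clusters) and not self.dispatched_to:
            self.stage_two()

    def stage_two(self):
        # Stage two: hand the real job to all reserved contacts at once.
        self.dispatched_to = [self.contacts[c] for c in self.clusters]


submission = TwoStageSubmission(["cluster-a", "cluster-b"])
placeholders = submission.stage_one()
submission.register("cluster-b", "gatekeeper-b")   # one reservation is not enough
submission.register("cluster-a", "gatekeeper-a")   # now the real job is dispatched
```

Keeping registration and dispatch in separate methods mirrors the separation of co-allocation from execution: the policy for choosing clusters and the policy for launching the real job can be changed independently.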
Another contribution of VJM is its resource selection algorithm, which aims to minimize the time needed for resource co-reservation. The algorithm also includes strategies for coping with deadlock and resource failure.

EVC improves on VJM in two respects. First, a virtual job in EVC creates a new virtual machine on its execution node, whereas a VJM virtual job creates a new personal gatekeeper process. Using a virtual machine as the execution unit is more convenient than using an operating system process: virtual machines are less tightly coupled to the operating system and hardware platform, so they can migrate between systems more easily. Second, personal gatekeepers can effectively be treated as fork clusters in grids. After the first stage of VJM, the end user gets a collection of resource manager contacts, each indicating a personal gatekeeper, and must still use the DUROC protocol to dispatch every child process of the parallel job to the collection of personal gatekeepers (fork clusters). EVC instead aggregates the virtual machines into a single virtual cluster and dispatches the real parallel job to this virtual cluster using the GRAM protocol, which is simpler and faster than DUROC.

The main problems addressed in our research are: a general form for expressing resource requirements and execution configuration; support for multiple virtual machine monitors and hypervisors; the virtual job scheduling strategy; the design of a virtual network for communication between virtual machines; efficient distribution and update of virtual machine disk images; the aggregation of virtual machines into virtual clusters; virtual cluster lifecycle management; and the mechanism for starting the real parallel job on virtual resources.

In our implementation and experiments, we found that virtual machine image distribution and virtual job scheduling are the two bottlenecks.
On the one hand, grids consist of resources from multiple administrative domains, each of which has different network and communication capabilities, so the choice of which set of physical clusters holds the virtual jobs has a large impact on virtual cluster performance. On the other hand, a virtual machine image contains all the software packages and persistent configuration, so the disk image is usually several gigabytes in size. Distributing such large image files from one central image server to a number of computing nodes is a critical problem that must be solved efficiently and reliably. We propose an image cache strategy to avoid redundant transfers of image templates. Copy-on-Write technology is used to share a single image template across multiple virtual machines, which minimizes the amount of data that must be transferred to each computing node. We also surveyed a wide variety of data transfer protocols, for example NFS for on-demand data transfer, FTP for unicast, RSYNC for quick synchronization, and BitTorrent for distributed file sharing. In the end, we chose BitTorrent as our tool for distributing images.
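The combination of a template cache and Copy-on-Write sharing can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not EVC's code: `ImageCache`, `CowDisk`, the `fetch` callable, and the tiny block size are all hypothetical stand-ins, and a real deployment would use an image format with backing-file support and a real transfer protocol.

```python
class ImageCache:
    """Hypothetical per-node cache: each template is transferred at most once."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable(template_id) -> bytes; stands in for the transfer
        self._templates = {}       # template_id -> cached image bytes
        self.transfers = 0         # number of transfers actually performed

    def get_template(self, template_id):
        if template_id not in self._templates:
            self._templates[template_id] = self._fetch(template_id)
            self.transfers += 1
        return self._templates[template_id]


class CowDisk:
    """Per-VM disk: reads fall through to the shared template, writes stay private."""

    def __init__(self, template, block_size=4):
        self._template = template
        self._block_size = block_size
        self._dirty = {}           # block index -> privately written bytes

    def read_block(self, index):
        if index in self._dirty:
            return self._dirty[index]
        start = index * self._block_size
        return self._template[start:start + self._block_size]

    def write_block(self, index, data):
        if len(data) != self._block_size:
            raise ValueError("data must be exactly one block long")
        self._dirty[index] = bytes(data)   # the shared template is never modified


# Two virtual machines on the same node share one cached template.
cache = ImageCache(lambda template_id: b"AAAABBBBCCCC")
vm1_disk = CowDisk(cache.get_template("base"))
vm2_disk = CowDisk(cache.get_template("base"))
vm1_disk.write_block(1, b"XXXX")           # visible to vm1 only
```

With this arrangement the per-VM cost is only the set of blocks a virtual machine has written, so starting additional virtual machines from an already cached template requires no further image transfer.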
Keywords/Search Tags: elastic virtual cluster, cloud computing, disk image distribution and reusing, tree-like multi-buffering file distribution, resource co-allocation