Improving and repurposing data center resource usage with virtualization

Posted on: 2015-07-25
Degree: Ph.D
Type: Dissertation
University: The George Washington University
Candidate: Hwang, Jinho
Full Text: PDF
GTID: 1478390017499632
Subject: Computer Science
Abstract/Summary:
Growing demands for storage and computation have driven the scaling up of data centers, the massive server pools that run the applications of businesses, individuals, and research groups. A data center can comprise thousands of physical servers, and each physical server can, in principle, host hundreds of virtual machines, depending on the available resources: CPU, memory, disk, and network. These resources are used by highly distributed applications, which raises many interesting resource management problems. In this dissertation we investigate the challenges of improving the efficiency of data center resources. Specifically, we emphasize how the design of new virtualization technologies and distributed-aware systems can improve resource efficiency and enhance application performance and data center management.

We first study the performance of the most widely used virtualization technologies (Hyper-V, KVM, vSphere, and Xen), together with data center resource usage statistics, to show the main problems we face. We then identify three major causes of these problems: interference, under-utilized resources, and virtualization overheads, each of which prevents applications from reaching maximum performance. In many cases, these problems can be solved by analyzing application workload characteristics, granting the hypervisor greater control over data center resources, and/or bypassing virtualization overheads.

The dissertation's first focus is on how application workload characteristics can help schedule resources. We propose a new CPU scheduler in the virtualization layer that helps the system decide how to prioritize virtual machines based on their workload characteristics. Adaptively scheduling virtual machines by priority in this way provides a better user experience. We also develop a hash space scheduler to control distributed memory cache systems. In contrast to the current method of assigning hash space statically, we use application workload characteristics to decide how to allocate the hash space for maximum performance.

We then investigate how the virtualization layer can better manage under-utilized data center resources. Data center servers are typically overprovisioned, leaving spare memory and CPU capacity idle so that they can absorb unpredictable workload bursts from the virtual machines running on them. We propose a new memory management system that repurposes spare memory that is not actively used. We extend this work to support a hierarchical memory structure, using a second layer of flash to substantially increase the cache size.

Lastly, we propose a way of bypassing virtualization overheads. Software routers, software-defined networks, and hypervisor-based switching technologies have sought to reduce the cost of virtualization overheads and increase flexibility compared to traditional network hardware. However, these approaches have been stymied by the performance achievable with commodity servers. These limitations on throughput and latency have prevented software routers from supplanting custom-designed hardware. To improve this situation, we propose a platform that runs complex network functionality at line speed on commodity hardware by bypassing virtualization overheads.
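
The abstract does not say how the hash space scheduler divides the hash space, so the following is only an illustrative sketch of the general idea of weighted rather than static allocation, not the dissertation's algorithm. The function names, the 32-bit hash space, the use of SHA-256, and the integer per-server weights (standing in for some workload-derived score) are all assumptions made for this example.

```python
import bisect
import hashlib

HASH_SPACE = 2 ** 32  # assumed size of the key hash space


def allocate_hash_space(weights):
    """Split [0, HASH_SPACE) into contiguous ranges proportional to weights.

    `weights` maps a cache server name to a non-negative integer score
    (hypothetical: e.g. derived from observed workload). At least one
    weight must be positive. Returns (exclusive_upper_bound, server)
    pairs in ascending order of bound.
    """
    total = sum(weights.values())
    bounds = []
    cumulative = 0
    for server in sorted(weights):
        cumulative += weights[server]
        bounds.append((HASH_SPACE * cumulative // total, server))
    return bounds


def lookup(bounds, key):
    """Return the server whose range contains the hash of `key`."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    point = int.from_bytes(digest[:4], "big")  # value in [0, HASH_SPACE)
    uppers = [upper for upper, _ in bounds]
    # The first range whose exclusive upper bound is greater than the point.
    return bounds[bisect.bisect_right(uppers, point)][1]


# Static assignment corresponds to equal weights; a workload-aware
# policy would change the weights and recompute the bounds.
static_bounds = allocate_hash_space({"cache-a": 1, "cache-b": 1})
weighted_bounds = allocate_hash_space({"cache-a": 1, "cache-b": 3})
```

With equal weights each server receives half of the space, which mirrors static assignment; with weights 1 and 3, the second server receives three quarters of it. A real scheduler would also have to handle the keys that move between servers when the bounds change, which this sketch ignores.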
Keywords/Search Tags: Data center, Virtualization, Application workload characteristics