Improving the end-to-end latency of datacenter applications using coordination across application components

Posted on:2016-11-02

Degree:Ph.D

Type:Dissertation

University:University of Illinois at Urbana-Champaign

Candidate:Jalaparti, Virajith

GTID:1478390017977609

Subject:Computer Science

Abstract/Summary:

To handle millions of user requests every second and process hundreds of terabytes of data each day, many organizations have turned to large datacenter-scale computing systems. The applications running in these datacenters consist of a multitude of dependent logical components, or stages, each of which performs a specific function. These stages are connected to form a directed acyclic graph (DAG), with edges representing input-output dependencies. Each stage can run over tens to thousands of machines and involves multiple cluster sub-systems such as storage, network and compute. The scale and complexity of these applications can lead to significant delays in their end-to-end latency. However, the organizations running these applications have strict requirements on this latency, as it directly affects their revenue and operational costs.

Addressing this problem, the goal of this dissertation is to develop scheduling and resource allocation techniques that optimize the end-to-end latency of datacenter applications. The key idea behind these techniques is to utilize coordination between different application components, allowing us to efficiently allocate cluster resources. In particular, we develop planning algorithms that coordinate the storage and compute sub-systems in datacenters to determine how many resources should be allocated to each stage in an application, and where in the cluster they should be allocated, to meet application requirements (e.g., meeting completion time goals or minimizing average completion time). To further speed up applications at runtime, we develop a few latency reduction techniques: reissuing laggards elsewhere in the cluster, returning partial results, and speeding up laggards by giving them extra resources. We perform a global optimization to coordinate across all the stages in an application DAG and determine which of these techniques works best for each stage, while ensuring that the cost incurred by these techniques is within a given end-to-end budget. We use application characteristics to predict and determine how resources should be allocated to different application components to meet the end-to-end latency requirements.

We evaluate our techniques on two different kinds of datacenter applications: (a) web services, and (b) data analytics. With large-scale simulations and an implementation in Apache YARN (Hadoop 2.0), we use workloads derived from production traces to show that our techniques can achieve more than 50% reduction in the 99th percentile latency of web services and up to 56% reduction in the median latency of data analytics jobs.
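
As an illustration of the DAG model described in the abstract, the following minimal Python sketch computes the end-to-end latency of an application as the longest (critical) path through its stage DAG. It is not the dissertation's algorithm. The function name `end_to_end_latency`, the stage names, and the latency values are hypothetical, and the sketch assumes each stage has a single fixed latency and starts as soon as all of its input stages have finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def end_to_end_latency(stage_latency, deps):
    """Return the critical-path latency of a stage DAG.

    stage_latency: dict mapping stage name -> latency of that stage
                   (hypothetical, fixed values)
    deps:          dict mapping stage name -> set of stages whose output
                   it consumes (every stage named here must also appear
                   in stage_latency)
    """
    graph = {stage: deps.get(stage, set()) for stage in stage_latency}
    finish = {}
    # static_order() yields every stage after all of its predecessors.
    for stage in TopologicalSorter(graph).static_order():
        # A stage starts once the slowest of its inputs has finished.
        start = max((finish[d] for d in graph[stage]), default=0.0)
        finish[stage] = start + stage_latency[stage]
    return max(finish.values(), default=0.0)


# Hypothetical four-stage application: frontend fans out to cache and db,
# and render waits for both.
latency = {"frontend": 5.0, "cache": 2.0, "db": 20.0, "render": 3.0}
deps = {
    "cache": {"frontend"},
    "db": {"frontend"},
    "render": {"cache", "db"},
}

print(end_to_end_latency(latency, deps))                    # 28.0
print(end_to_end_latency({**latency, "cache": 1.0}, deps))  # 28.0
print(end_to_end_latency({**latency, "db": 10.0}, deps))    # 18.0
```

In this toy example, halving the latency of `cache` leaves the end-to-end figure unchanged because `cache` is off the critical path, while speeding up `db` shortens it. Under these simplifying assumptions, this is one way to see why a decision about any single stage depends on the rest of the DAG.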

Keywords/Search Tags:

Latency, Data, Components

Related items

1	ERDQN Data Scheduling Algorithm For Low-latency Transmission
2	Research On Achieving Consistent Low Latency For Commodity SSD Arrays
3	The Low Latency Of Data Transmission Design Based On FPGA
4	Cost-Effective Support for Low Latency Cloud Storage
5	Research On Low Latency And Energy Efficient Data Gathering In WSN
6	High Throughput Low Latency Congestion Control Method In Data Center Networks
7	Improving Datacenter's Resource Efficiency With Fine-grained Control Over Cloud Service Components
8	Flexible and efficient control of data transfers for loosely coupled components
9	Research On Green Latency-aware Data Deployment In Cloud Data Centers
10	Ant Colony Optimization Of Virtual Machine Placement For Data Latency Minimization In Cloud Systems