Achieving Efficient I/O with High-Performance Data Center Technologies

Posted on: 2016-05-17
Degree: Ph.D
Type: Dissertation
University: University of California, San Diego
Candidate: Conley, Michael Aaron
Full Text: PDF
GTID: 1478390017983371
Subject: Computer Science
Abstract/Summary:
Recently there has been a significant effort to build systems designed for large-scale data processing, or "big data." These systems are capable of scaling to thousands of nodes, and offer large amounts of aggregate processing throughput. However, there is a severe lack of attention paid to the efficiency of these systems, with individual hardware components operating at speeds as low as 3% of their available bandwidths. In light of this observation, we aim to demonstrate that efficient data-intensive computation is not only possible, but also results in high levels of overall performance.

In this work, we describe two highly efficient data processing systems, TritonSort and Themis, built using 2009-era cluster technology. We evaluate the performance of these systems and use them to set world records in high-speed sorting. Next, we consider newer, faster hardware technologies that are not yet widely deployed. We give a detailed description of the design decisions and optimizations necessary for efficient data-intensive computation on these technologies. Finally, we apply these optimizations to large-scale data-processing applications running in the public cloud, and once again set world records in high-speed sorting. We present the details of our experience with the Amazon Web Services (AWS) cloud and also explore Google Cloud Platform.
Keywords/Search Tags:Data, Efficient, Systems