Font Size: a A A

LambdaRAM: A high-performance, multi-dimensional, distributed cache over ultra-high speed networks

Posted on:2010-12-25Degree:Ph.DType:Dissertation
University:University of Illinois at ChicagoCandidate:Vishwanath, VenkatramFull Text:PDF
GTID:1448390002990194Subject:Computer Science
Abstract/Summary:
Interactive, real-time exploration and correlation of multi-terabyte and petabyte datasets from multiple sources are critical to advancing scientific discovery in many disciplines, including climate modeling and prediction, biomedical imaging, geosciences, high-energy physics and homeland security. These data-intensive applications are now being enabled by the OptIPuter, a new paradigm that relies on multi-gigabit photonic networks to interconnect distributed storage, computing and visualization resources, thereby creating a planetary-scale supercomputer. Critical performance bottlenecks have been the access latencies associated with reading and/or writing data from/to storage systems and clusters, whether connected via local or wide-area networks. In the case of climate modeling and analysis, NASA's Modeling, Analysis, and Prediction (MAP) project noted that these latencies result in "...the supercomputers remaining idle for 25-50% of the execution time during the analysis phase." Reducing I/O latency would allow researchers to run more complex models during the same timeframe, resulting in faster and more accurate weather prediction. LambdaRAM is a high performance, multi-dimensional, distributed cache that harnesses the memory of nodes in one or more clusters interconnected by ultra-high-speed networks, provided upto a 5-fold improvement. LambdaRAM's memory and data management capabilities enable applications to rapidly stride over multi-dimensional, multi-terabyte scientific datasets. It employs novel proactive latency mitigation heuristics, including presending and prefetching, based on the access patterns of an application, and data transfer protocols designed for high-bandwidth networks to mitigate data access latencies. Using LambdaRAM, climate analysis applications to compute wind shear, surface temperature and ozone thickness are able to rapidly stride through multi-terabyte datasets over both local and wide-area networks and achieve up to 20-fold improvement in performance.
Keywords/Search Tags:Over, Networks, Performance, Data, Multi-terabyte, Distributed, Multi-dimensional, Lambdaram
Related items