Designing scalable and high performance one sided communication middleware for modern interconnects

Posted on: 2010-12-26
Degree: Ph.D
Type: Dissertation
University: The Ohio State University
Candidate: Santhanaraman, Gopalakrishnan
Full Text: PDF
GTID: 1448390002488995
Subject: Engineering
Abstract/Summary:
High-end computing (HEC) systems are enabling scientists and engineers to tackle grand challenge problems in their respective domains and make significant contributions to their fields. Examples of such domains include astrophysics, earthquake analysis, weather prediction, nanoscience modeling, multiscale and multiphysics modeling, biological computations, and computational fluid dynamics. There has been great emphasis on designing, building and deploying ultra-scale HEC systems to provide true petascale performance for these grand challenge problems. At the same time, clusters built from commodity PCs are predominantly used as mainstream tools for high-end computing owing to their cost-effectiveness and easy availability.

The communication subsystem plays a pivotal role in achieving scalable performance in clusters. Of late there has been a lot of interest in one-sided communication models, which are seen as a viable option for petascale applications. One-sided communication provides good potential for overlapping computation with communication. To provide high performance and scalability, the one-sided communication subsystem needs to be designed to leverage the advanced capabilities of modern interconnects.

In this dissertation we study and explore various aspects of one-sided communication in middleware libraries, such as zero-copy transfers, overlap, reduced remote CPU utilization, latency hiding techniques, and non-contiguous data transfers. We improved the passive synchronization design by using RDMA atomic operations, yielding a design that provides high overlap capability. We also proposed a hybrid design that extends this approach to optimize intra-node communication as well. We have also explored the use of remote completion semantics for RDMA operations in InfiniBand to improve the performance of fence synchronization. To optimize non-contiguous data communication, we proposed novel zero-copy designs using InfiniBand scatter/gather operations with reduced remote CPU utilization. Designs using RDMA atomic primitives have been proposed to improve the performance of read-modify-write operations. Further, we have also proposed latency hiding techniques that use non-blocking semantics and aggregation mechanisms.
Keywords/Search Tags:Performance, Communication, Operations, Proposed