
Improving cluster performance through the use of programmable network interfaces

Posted on: 2004-10-18
Degree: Ph.D
Type: Dissertation
University: The Ohio State University
Candidate: Buntinas, Darius Tomas
Full Text: PDF
GTID: 1468390011464834
Subject: Computer Science
Abstract/Summary:
Cluster computing systems are becoming increasingly popular environments for day-to-day computational needs because they are cost-effective. While clusters are considerably less expensive than massively parallel processors (MPPs), MPP communication performance is typically much better than cluster communication performance. Some modern network interface controllers (NICs) have programmable processors that can be used to offload communication processing from the host processor.

Process skew is inherent in cluster communication systems. Some processes may be delayed relative to others due to various unavoidable causes. Many collective communication operations are implemented so that every participating process must perform the operation before it can proceed. As a result, a single delayed process may delay all the other processes taking part in a collective operation.

This dissertation investigates the use of programmable NICs to improve cluster performance. We approach this problem by focusing on improving the performance, scalability, and tolerance to process skew of synchronization operations and collective communication operations through the use of NIC-based operations and NIC-based primitives.

NIC-based operations and primitives improve the performance of cluster systems. Latency is improved in some operations by performing the operation directly at the NIC and avoiding sending messages over the slow I/O bus. Host processor utilization is improved because host processor involvement in the operation is reduced, which also allows computation to be overlapped with the operation. NIC-based operations are also less sensitive to process skew.

To demonstrate the effect of NIC-based operations, we have designed and implemented NIC-supported broadcast/multicast, barrier synchronization, reduction, and atomic remote memory operations, as well as an application-bypass broadcast, on Myrinet/GM and Myrinet/FM. The NIC-supported implementations improved the performance of the operations over the conventional host-based implementations. For instance, our NIC-based barrier operation improved latency by a factor of up to 2.22. The NIC-based reduction operation improved host processor utilization by a factor of up to 2.7. Our NIC-supported application-bypass broadcast improved host utilization by a factor of up to 16 in the presence of process skew.
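
The effect of process skew on a collective operation, as described in the abstract, can be illustrated with a toy model. The sketch below is not the dissertation's implementation. It assumes a simplified barrier that releases every process once the last one has arrived, plus a fixed barrier latency. The function name `barrier_wait_times`, its parameters, and the arrival times are hypothetical and chosen only for illustration.

```python
def barrier_wait_times(arrival_times, barrier_latency=0.0):
    """Return how long each process waits at a barrier.

    Toy model (an assumption, not the dissertation's design): the barrier
    completes once the last process arrives, plus a fixed latency for the
    barrier operation itself.
    """
    if not arrival_times:
        return []
    # Every process is released at the same moment.
    release_time = max(arrival_times) + barrier_latency
    return [release_time - t for t in arrival_times]


# Four hypothetical processes; the last one is skewed by about 50 time units.
arrivals = [0.0, 1.0, 2.0, 50.0]
waits = barrier_wait_times(arrivals, barrier_latency=5.0)
print(waits)  # [55.0, 54.0, 53.0, 5.0]
```

In this model the three on-time processes spend most of their wait idle because of the one late process, which is the host-time cost that skew-tolerant operations aim to reduce.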
Keywords/Search Tags: Cluster, Performance, Process skew, Host, Operations, Programmable