Beehive: Application-driven systems support for cluster computing

Posted on: 1998-09-20
Degree: Ph.D.
Type: Dissertation
University: Georgia Institute of Technology
Candidate: Singla, Aman
Full Text: PDF
GTID: 1468390014477318
Subject: Computer Science
Abstract/Summary:
A workstation cluster is a viable low-cost computing platform on which to implement high performance parallel applications. The Beehive system uses applications as a driver to develop systems support for cluster computing, targeting both functionality and high availability as well as performance and scalability. Beehive is a software distributed shared-memory architecture that provides flexibility in the memory system.

An important attribute in the specification of many compute-intensive applications is "time." There is a mismatch between the synchronization and consistency guarantees needed by such applications (which are temporal in nature) and the guarantees offered by current shared-memory systems. Consequently, programming such applications using standard shared-memory style synchronization and communication is cumbersome. Furthermore, such applications offer opportunities for relaxing both the synchronization and consistency requirements along the temporal dimension. In Beehive, we develop a temporal programming model that is more intuitive for the development of applications that need temporal correctness guarantees. This model embodies two mechanisms: "delta consistency"--a novel time-based correctness criterion to govern the shared-memory access guarantees, and a companion "temporal synchronization"--a mechanism for thread synchronization along the time axis. These mechanisms are particularly appropriate for expressing the requirements of interactive application domains.

In addition to the temporal programming model, we develop efficient explicit communication mechanisms that aggressively push the data out to "future" consumers to hide the read miss latency at the receiving end.

In Beehive, we also support dynamic reconfiguration to aid high availability of an application in the face of "hard" and "soft" node failures in the system. Each system module is designed with mechanisms for robustness in the event of failures. Our approach does not aim towards achieving transparent fault tolerance by enforcing a specific policy in the system. Instead it relies on co-operation among the various system layers as well as between the system and the application. System layers co-operate with one another for failure recovery by promoting crash-awareness through their interfaces. The application itself is expected to be robust, i.e., the system seeks co-operation from the application for the recovery of both the data and the computation upon failures. Beehive exports a special crash-aware API for this purpose. This approach is useful in keeping the cost of providing high availability very low.

Using a virtual environment application as the driver, we show the efficacy of the proposed mechanisms in meeting the real-time and reliability requirements of such applications.
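
The abstract names "delta consistency" as a time-based correctness criterion but does not define it. As a rough illustration only, the Python sketch below assumes it means that a read may return a cached value as long as that copy is less than delta time units old. The class `DeltaCachedValue`, its `fetch` callback, and the explicit `now` argument are all hypothetical and are not taken from Beehive.

```python
class DeltaCachedValue:
    """Hypothetical reader-side cache (not Beehive's API): a read may be
    served locally only while the cached copy is younger than `delta`."""

    def __init__(self, fetch, delta):
        self.fetch = fetch      # assumed callable returning the current shared value
        self.delta = delta      # staleness bound, in the same units as `now`
        self.value = None
        self.fetched_at = None  # time of the last refresh; None means never fetched

    def read(self, now):
        # Refresh when nothing is cached yet or the copy is at least delta old.
        if self.fetched_at is None or now - self.fetched_at >= self.delta:
            self.value = self.fetch()
            self.fetched_at = now
        return self.value


shared = {"x": 1}
cache = DeltaCachedValue(lambda: shared["x"], delta=5)
first = cache.read(now=0)   # nothing cached yet, so the value is fetched
shared["x"] = 2             # a writer updates the shared value
stale = cache.read(now=3)   # within delta: the older cached copy is still allowed
fresh = cache.read(now=5)   # staleness bound reached: the copy is refreshed
```

Under this assumed reading, the bound trades freshness for fewer remote fetches: a reader that can tolerate slightly old data avoids communication until the bound expires.
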
Keywords/Search Tags:Application, System, Beehive, Cluster, Mechanisms, Support