Dynamic data replication: An approach to providing fault-tolerant shared memory clusters

Posted on: 2003-10-24
Degree: M.Sc
Type: Thesis
University: University of Toronto (Canada)
Candidate: Christodoulopoulou, Rosalia
Full Text: PDF
GTID: 2468390011982490
Subject: Computer Science
Abstract/Summary:
In this thesis we address the challenge that modern server systems face in dealing with failures transparently and meeting application-imposed requirements for continuous operation. More specifically, we design extensions to an existing SVM protocol that has been tuned for low-latency, high-bandwidth interconnects and SMP nodes, and we achieve reliability through dynamic replication of application shared data and protocol information. Our extensions allow us to tolerate single node failures, or multiple node failures as long as they are not simultaneous. We also implement our extensions on a state-of-the-art cluster and evaluate the common, failure-free case. We find that, although the complexity of our protocol is substantially higher than that of its failure-free counterpart, our approach takes advantage of architectural features of modern systems to impose low overhead, and it can be employed to deal with system failures transparently.
Keywords/Search Tags:Failures