Asynchronous checkpointing and recovery approach for distributed systems

Posted on: 2010-07-11
Degree: M.S.
Type: Thesis
University: Southern Illinois University at Carbondale
Candidate: Vasireddy, Rahul
Full Text: PDF
GTID: 2448390002481761
Subject: Computer Science
Abstract/Summary:
In this work, we present a high-performance recovery algorithm for distributed systems in which checkpoints are taken asynchronously. After the system recovers from a failure, the algorithm quickly determines the most recent consistent global checkpoint (the maximum consistent state) of the distributed system. Its main feature is that it largely avoids unnecessary comparisons of checkpoints when testing them for mutual consistency. All participating processes execute the algorithm simultaneously, which keeps its execution fast. We also present an enhancement of the proposed recovery idea that bounds the otherwise dynamically growing lengths of the data structures it uses.
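To make "testing checkpoints for mutual consistency" concrete, here is a minimal sketch of the textbook rollback approach based on vector clocks. It is not the algorithm proposed in the thesis: it runs sequentially in one place and makes no attempt to skip unnecessary comparisons. The names `is_consistent` and `latest_consistent_cut` are hypothetical, and the sketch assumes that each checkpoint stores the vector clock of its process at the time it was taken and that every process has an initial all-zero checkpoint.

```python
from typing import List

VectorClock = List[int]  # assumed representation: one counter per process


def is_consistent(cut: List[VectorClock]) -> bool:
    """A cut (one checkpoint per process) is consistent when no checkpoint
    has seen more of process i than process i's own checkpoint records."""
    n = len(cut)
    return all(cut[j][i] <= cut[i][i] for i in range(n) for j in range(n))


def latest_consistent_cut(checkpoints: List[List[VectorClock]]) -> List[int]:
    """Return, per process, the index of its checkpoint in the most recent
    consistent global checkpoint.

    checkpoints[p] lists the checkpoints of process p in the order taken.
    Assumption: checkpoints[p][0] is the all-zero initial state, so the
    rollback below always terminates at a valid index.
    """
    n = len(checkpoints)
    idx = [len(cps) - 1 for cps in checkpoints]  # start from the latest ones
    changed = True
    while changed:
        changed = False
        for i in range(n):
            own = checkpoints[i][idx[i]][i]
            for j in range(n):
                # Roll process j back while its checkpoint depends on events
                # of process i that i's chosen checkpoint does not include.
                while checkpoints[j][idx[j]][i] > own:
                    idx[j] -= 1
                    changed = True
    return idx


# Two processes: P1's last checkpoint depends on a P0 event (counter 2)
# that P0's latest checkpoint (counter 1) has not recorded.
example = [
    [[0, 0], [1, 0]],          # checkpoints of P0
    [[0, 0], [0, 1], [2, 2]],  # checkpoints of P1
]
cut = latest_consistent_cut(example)
```

In this example P1 is rolled back by one checkpoint while P0 keeps its latest one. The repeated pairwise comparisons in the inner loops are the kind of work the abstract says the proposed algorithm largely avoids.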
Keywords/Search Tags: Recovery, Distributed