
Design Of The Digital Organism Database Fault Tolerance Mechanism

Posted on: 2010-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: K Yang
Full Text: PDF
GTID: 2208360308466305
Subject: Software engineering
Abstract/Summary:
Decentralized distributed systems, which have no single point of failure and provide highly scalable services over a WAN, have become a research focus in recent years. However, most of their applications are confined to file sharing, which places low demands on fault tolerance. Consequently, fault-tolerance mechanisms for this kind of system have not been studied well, yet they must be developed to the same level as in traditional distributed systems in order to support more applications.

The Digital Organism DataBase System is a database system that has the properties of an organism. It is built on a distributed parallel system developed over many years by 8010 Lab, UESTC. Drawing on fault tolerance, distributed-systems principles, parallel processing, and networking technology, this dissertation proposes several novel fault-tolerance mechanisms tailored to the characteristics of the Digital Organism System.

Recovery from node failures is a critical issue in the Digital Organism DataBase System. When failures occur, the recovery system allows the database to return to a consistent state and continue its service. Moreover, a database node also goes through a recovery process during startup, through which it becomes consistent with the other running nodes in the system.

Among the various recovery techniques, log-based recovery has grown popular for its reliability and tolerable overhead. However, in conventional log-based recovery protocols, the nodes providing the recovery service may still be overburdened, especially when the recovery is resource-consuming. As a result, not only is system performance compromised, but the likelihood of large-scale failure also increases. In this paper, we present a dynamic recovery protocol. The key idea of the new protocol is to cache new database operations during recovery. All these cached operations can then be replayed independently later. The analysis indicates that the new protocol can improve recovery speed by reducing disk I/O and minimizing inter-node dependency during recovery. Therefore, the system failure rate is reduced and overall performance improves.
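
The abstract describes the key idea only in outline, so the following is a minimal sketch of caching operations during recovery and replaying them afterwards, not the protocol from the thesis. It assumes a toy key-value state, a recovered snapshot handed over in one piece, and replay in arrival order; the names `Operation`, `RecoveringNode`, `submit`, and `finish_recovery` are hypothetical and do not come from the source.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Operation:
    """Hypothetical write operation: set `key` to `value`."""
    key: str
    value: int


@dataclass
class RecoveringNode:
    """Toy node that buffers new operations while it is recovering."""
    state: dict = field(default_factory=dict)
    recovering: bool = False
    cached: deque = field(default_factory=deque)

    def begin_recovery(self) -> None:
        # Enter recovery mode with an empty operation cache.
        self.recovering = True
        self.cached.clear()

    def submit(self, op: Operation) -> None:
        if self.recovering:
            # Leave the state untouched; remember the operation in arrival order.
            self.cached.append(op)
        else:
            self._apply(op)

    def finish_recovery(self, snapshot: dict) -> int:
        """Install the recovered state, then replay the cache locally.

        Returns the number of replayed operations.
        """
        self.state = dict(snapshot)
        replayed = 0
        while self.cached:
            self._apply(self.cached.popleft())
            replayed += 1
        self.recovering = False
        return replayed

    def _apply(self, op: Operation) -> None:
        self.state[op.key] = op.value


def demo() -> dict:
    node = RecoveringNode()
    node.begin_recovery()
    # Operations that arrive during recovery are cached, not applied.
    node.submit(Operation("x", 1))
    node.submit(Operation("y", 2))
    node.submit(Operation("x", 3))
    # The snapshot stands in for whatever state the recovery service supplies.
    node.finish_recovery({"x": 0, "z": 9})
    # After recovery, operations are applied directly again.
    node.submit(Operation("z", 10))
    return node.state
```

In this sketch the replay step needs only the node's own cache, which is one way to read the abstract's claim that cached operations can be replayed independently; how the real protocol orders operations or obtains the recovered state is not stated in the abstract.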
Keywords/Search Tags:Digital Organism DataBase System, fault-tolerance, recovery