Design And Implementation Of Snapshot For Distributed Database HBase

Posted on: 2012-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: C X Li
Full Text: PDF
GTID: 2178330332476009
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology and Internet applications, the amount of data has grown explosively in recent years. Cloud computing and distributed systems have become the dominant approach to processing big data. A distributed database, as one kind of distributed system, provides random, real-time access to large volumes of structured data. Compared with a standalone system, a distributed database offers high performance, high reliability, low cost, and easy scalability, so it has been widely used in large Internet applications.

A snapshot is a consistent image of the system, created in a short time while the system is still running. Because it captures the complete state of the system, a snapshot can be used not only for rapid backup and recovery but also in other scenarios such as load balancing and application testing. Most current databases and other storage systems have implemented snapshots, but in distributed databases snapshots have not been implemented as a primary feature.

Based on an understanding of the internals of distributed databases, this thesis designs and implements snapshots for the distributed database HBase for the first time. The key problems in the snapshot process are analyzed, including message passing, the system synchronization mechanism, concurrency control, and exception handling. By exploiting the characteristics of the underlying file system and log system, a snapshot can be created in a relatively short time while keeping its storage cost and its impact on the running system to a minimum. We also present the design of snapshot usage. By restoring a snapshot, the system can be recovered to the moment at which the snapshot was created. Export and import of snapshots provide a new method of physical backup for a distributed database that does not require halting the system. Experiments show that the design meets expectations.
Keywords/Search Tags: Distributed Database, HBase, Snapshot, ZooKeeper, Restore