
Research On Key Techniques Of Multi-Version Block-Level Backup Data Management

Posted on: 2011-05-27    Degree: Doctor    Type: Dissertation
Country: China    Candidate: G J Wu    Full Text: PDF
GTID: 1118330338989434    Subject: Information security
Abstract/Summary:
With the growth of the Internet and the maturity of disk technology, the volume of data has grown explosively, and people depend on data more than ever before. Data corruption can be caused by all kinds of failures and disasters, such as human errors, virus attacks, firmware malfunctions, software defects, and even site failures. Fine-granularity multi-version replication techniques have therefore attracted wide attention in recent years: compared with traditional backup techniques they provide fine-granularity data protection and can even support high-availability service for critical business. However, most research so far has concentrated on the replication techniques themselves, and the resiliency of the backup data has become a challenging problem in data replication systems.

This paper focuses on block-level backup data. A block-level replication system can provide application-independent services and thus lower the cost of data protection, but block-level backup data consists of raw binary blocks whose content is opaque. Existing technologies, such as multi-version indexing structures, content-based indexing, and disaster-recovery backup systems, cannot be used directly to index multi-version block-level backup data. The objective of this paper is to study the key technologies that improve the recovery capability and efficiency of multi-version backup data. The paper is organized into the following parts:

First, in order to design a universal indexing method for all kinds of block-level replication technologies, an analysis framework is built that covers the important factors in backup data management, including the block-level replication techniques, the size of the backup data volume, the distribution features of block-level backup data, and the backup and recovery processes. The research goal is to guarantee backup data recovery capability with the minimum volume of multi-version backup data. The framework explains the problems in the multi-version backup data management process, provides analysis methods for them, and supplies the theoretical basis for the following parts.

Second, to meet the frequent-update, retrieval, and storage-efficiency requirements of high-frequency backup data, a novel indexing structure for CDP (Continuous Data Protection) backup data is proposed. Based on a time-sequence analysis of multi-version backup data, we design the FBQL (Forward-and-Backward Query Log) indexing structure, which supports both roll-backward and roll-forward queries in the same structure. Experimental results show that FBQL is well suited to write-intensive applications. Combined with checkpoint technology, FBQL also enables application-consistent data recovery, and it provides the basic indexing structure for the subsequent work.

Third, since snapshots can serve as recovery points that guarantee the correctness of recovery in multi-version backup data management, we design and implement a merging-based snapshot indexing method, HCSIM (Hierarchical Clustering Snapshot Indexing Method). Within HCSIM, the BVSM (Bitmap Vectors Snapshot Merging) algorithm is further proposed to reduce the indexing redundancy caused by the locality of disk writes. Analysis and experimental results show that HCSIM provides an efficient snapshot query method, eliminates the write-skew effect, and achieves a better balance between metadata storage and retrieval efficiency.
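To make the idea behind bitmap-vector snapshot merging more concrete, the following Python sketch shows how per-version change bitmaps could be folded into a single snapshot index in which the newest writer of each block wins. It is an illustrative assumption, not the BVSM algorithm itself; the merge_bitmaps function, the list-based bitmaps, and the sample versions are all hypothetical.

    # A minimal sketch of merging per-version change bitmaps into one
    # snapshot index. Block layout and data structures are illustrative
    # assumptions, not the dissertation's BVSM implementation.
    def merge_bitmaps(version_bitmaps):
        """version_bitmaps: list of (version_id, bitmap), oldest to newest,
        where bitmap[i] is True if block i was written in that version.
        Returns (merged, latest_writer): merged[i] says whether block i is
        covered by the snapshot, and latest_writer[i] is the newest version
        that wrote block i (the one a snapshot read must use)."""
        n_blocks = len(version_bitmaps[0][1])
        merged = [False] * n_blocks
        latest_writer = [None] * n_blocks
        for version_id, bitmap in version_bitmaps:      # oldest to newest
            for i, written in enumerate(bitmap):
                if written:
                    merged[i] = True                    # block is covered
                    latest_writer[i] = version_id       # newer writes win
        return merged, latest_writer

    # Example: three versions touching overlapping blocks.
    versions = [("v1", [True, True, False, False]),
                ("v2", [False, True, True, False]),
                ("v3", [True, False, False, False])]
    merged, writers = merge_bitmaps(versions)
    # merged  -> [True, True, True, False]
    # writers -> ['v3', 'v2', 'v2', None]

Merging versions in this way collapses repeated entries for frequently rewritten blocks, which is the kind of redundancy that disk write locality produces.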
Fourth, because bitmap indexing is inefficient for multi-version block-level backup data management, we present a variable block-length indexing method that uses variable-length blocks as the indexing unit (a brief sketch is given below). On top of it, we further present version-merging, snapshot query, and version-deletion algorithms. Compared with bitmap indexing, the variable block-length method reduces the metadata volume and accelerates snapshot queries. Experimental results show that it outperforms traditional multi-version indexing methods in metadata size, update efficiency, and version operations.

Fifth, combining the technologies proposed in this thesis, we design and implement a disk data backup and recovery system. The system consists of three parts: the client side, the metadata server, and the storage server, with the storage server integrating all of the technologies proposed in the thesis. In practical evaluations, our multi-version backup data management method achieves low indexing complexity and can restore the backup data of any version point. It is therefore well suited to multi-version block-level backup data management and meets our initial research objectives.
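Referring back to the fourth part, the difference between bitmap indexing and variable block-length indexing can be sketched as follows. The ExtentIndex class, its method names, and the byte offsets are illustrative assumptions rather than the thesis implementation; the point is only that one variable-length entry per write replaces one bit per fixed-size block.

    # A minimal sketch of a variable block-length version index: each write
    # is recorded as an (offset, length, version) entry, and a snapshot
    # query returns the newest entry covering a given offset.
    class ExtentIndex:
        def __init__(self):
            # Append order equals time order, so later entries are newer.
            self.extents = []   # list of (offset, length, version_id)

        def record_write(self, offset, length, version_id):
            # One entry per write, however many fixed-size blocks it spans.
            self.extents.append((offset, length, version_id))

        def snapshot_lookup(self, offset, upto_version):
            """Return the version whose data is visible at `offset` in the
            snapshot taken at `upto_version`, or None if never written."""
            winner = None
            for off, length, ver in self.extents:
                if ver <= upto_version and off <= offset < off + length:
                    winner = ver    # later (newer) matches override earlier ones
            return winner

    idx = ExtentIndex()
    idx.record_write(0, 4096, 1)     # version 1 writes 4 KiB at offset 0
    idx.record_write(1024, 512, 2)   # version 2 overwrites 512 bytes inside it
    print(idx.snapshot_lookup(1200, upto_version=1))   # -> 1
    print(idx.snapshot_lookup(1200, upto_version=2))   # -> 2

Because a large sequential write becomes a single entry instead of a long run of bits, the metadata volume shrinks and snapshot queries touch fewer entries, which is the direction of the improvement reported in the experiments above.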
Keywords/Search Tags:Data Recovery, Multi-Version Backup Data Management, Snapshot Query, Indexing Technology, Block-Level Data