The Research On HBase Data Recovery Technology Based On Data Storage Characteristic

Posted on: 2018-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: L Zeng
Full Text: PDF
GTID: 2348330515966721
Subject: Computer Science and Technology
Abstract/Summary:
As an important means of data storage and management, databases are used ever more widely in all walks of life, and database forensics has therefore become a research hotspot in the field of digital forensics. Recovering deleted data is an important part of database forensics. At present, research on database forensics focuses mainly on relational databases, and few papers address NoSQL databases. This paper targets HBase, a widely used distributed NoSQL database, and studies how to recover HBase records effectively and completely. The main work is as follows.

First, based on the storage characteristics of HDFS, this paper proposes a method for recovering HBase records using the corresponding checksum file. The method uses file carving to restore the Data blocks of HBase data files from a disk image, and then extracts the corresponding table records from the recovered Data blocks according to the Data block structure and the record format. File fragmentation is a difficult problem in file carving; this method uses the checksum file to identify file fragments. Experiments were run under three different conditions, namely disk cluster sizes of 4 KB, 2 KB, and 1 KB. They demonstrate that the method can restore HBase records effectively: the recovery rate is nearly 100% when the cluster size is 4 KB or 2 KB, and when the cluster size is 1 KB the lowest recovery rate is 83.61%.

Second, based on the characteristics of the write-ahead log storage structure, this paper proposes a method for recovering HBase records from the HBase write-ahead log. After analyzing the structure of the write-ahead log, we use the sync marker to identify fragments of the log file, then use the sequence number to sort the identified fragments and reconstruct the log file. Finally, we analyze the log entry format to restore the corresponding records. As before, experiments were run on disks with the three cluster sizes. They demonstrate that when the log file has not been overwritten, all records can be recovered on disks with 4 KB and 2 KB clusters. On disks with 1 KB clusters the method is slightly less effective, with a recovery rate of 96.22%. When the log file has been partly overwritten, as many records as possible are recovered from the remaining part.

This paper studies data recovery techniques for the HBase database, and the results contribute to further research in the field of HBase database forensics.
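
To illustrate the idea of using a checksum file to identify file fragments, the following is a minimal sketch and not the thesis's implementation. It makes several assumptions that do not come from the abstract: each checksum covers a fixed 512-byte chunk, `zlib.crc32` stands in for whatever checksum algorithm the HDFS checksum file actually uses, the file length is a whole number of clusters, and the helper names `chunk_checksums` and `reassemble` are hypothetical. Candidate clusters from the disk image are matched greedily against the next expected run of checksums.

```python
import zlib

CHUNK = 512  # assumed number of data bytes covered by one checksum


def chunk_checksums(data: bytes, chunk: int = CHUNK) -> list:
    """Checksum every fixed-size chunk of data (CRC32 used as a stand-in)."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def reassemble(clusters: list, checksums: list, chunk: int = CHUNK) -> list:
    """Return the indices of clusters in file order.

    clusters  : raw clusters carved from the disk image, in on-disk order
    checksums : expected per-chunk checksums taken from the checksum file
    Stops early if no unused cluster matches the next expected checksums,
    for example because that fragment was overwritten.
    """
    order, used, pos = [], set(), 0
    while pos < len(checksums):
        match = None
        for idx, cluster in enumerate(clusters):
            if idx in used:
                continue
            got = chunk_checksums(cluster, chunk)
            # A cluster belongs here only if all of its chunks match.
            if got == checksums[pos:pos + len(got)]:
                match = idx
                break
        if match is None:
            break
        order.append(match)
        used.add(match)
        pos += len(chunk_checksums(clusters[match], chunk))
    return order


if __name__ == "__main__":
    # Four 1 KB clusters of a hypothetical file, scattered among unrelated data.
    parts = [bytes([i + 1]) * 1024 for i in range(4)]
    expected = chunk_checksums(b"".join(parts))
    disk = [parts[2], b"\xff" * 1024, parts[0], parts[3], parts[1]]
    print(reassemble(disk, expected))
```

A smaller cluster size gives each candidate cluster fewer checksums to match against, which is one plausible reason identification becomes harder on 1 KB clusters; the sketch does not attempt to model that effect or handle a partially filled final cluster.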
Keywords/Search Tags: HBase, HDFS, data recovery, HFile, write-ahead log