
A Study of Textual Content Recovery Technology for Fragmented or Corrupted OOXML Documents

Posted on: 2015-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhang
Full Text: PDF
GTID: 2268330428963898
Subject: Computer technology
Abstract/Summary:
File carving is an effective data recovery method that does not rely on file system meta-information. By exploiting the characteristics of the data itself, it can preserve most of the original file's data, and it largely fills the gap left by traditional recovery techniques that depend on file system meta-information. So far, however, carving research has focused mainly on complete files or on fragmented files that have lost none of their original data; research on fragmented files with partial data loss has advanced slowly. This thesis studies documents in the OOXML format and investigates how to recover the textual content of the original file when the file is fragmented or some of its data has been lost.

First, this thesis proposes a textual content recovery method for DOCX documents, based on an in-depth analysis of the DOCX document structure defined by the OOXML standard. By reading data directly from the disk or a disk image and searching for DOCX data fragments, the method obtains the data and structure of the main document part and reassembles them in their original order to recover the textual content of the DOCX document. Experimental results show that the method effectively recovers the textual content of corrupted or fragmented DOCX documents.

Second, this thesis develops a textual content recovery method for PPTX documents, based on an in-depth analysis of the PPTX document structure defined by the OOXML standard. By reading data directly from the disk or a disk image and searching for PPTX data fragments, the method sequentially obtains the data and structure of each slide part and reassembles them in their original order to recover the textual content of the PPTX document. Experimental results show that the method effectively recovers the textual content of fragmented or corrupted PPTX documents.

Finally, this thesis proposes a textual content recovery method for XLSX documents, based on an in-depth analysis of the XLSX document structure defined by the OOXML standard. Reading data directly from the disk or a disk image, the method first uses the structure at the tail of the file to carve the original file to its greatest possible length, which narrows the search range. It then searches for XLSX data fragments, obtains the data and structure of every sheet part and of the shared strings part, and reassembles them in their original order to recover the textual content of the XLSX document. Experimental results show that the method effectively recovers the textual content of corrupted or fragmented XLSX documents.

In summary, this thesis studies textual content recovery from data blocks of corrupted or fragmented documents in the OOXML format and proposes a recovery method that reassembles the key parts of a document in their original order. The results offer a useful reference for recovering data from corrupted data fragments.
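
The DOCX case described above can be illustrated with a minimal sketch. It is not the thesis's algorithm, only one plausible reading of "search the raw data for DOCX fragments and read the main document part". It relies on the facts that an OOXML package is a ZIP archive, that the main document part of a DOCX file is stored as `word/document.xml`, and that run text sits in `w:t` elements. The function names (`find_zip_entries`, `inflate_partial`, `recover_docx_text`) are hypothetical, and the sketch assumes the compressed part lies contiguously in the image, which a real fragmented file may not satisfy.

```python
import html
import re
import struct
import zlib

LOCAL_HEADER_SIG = b"PK\x03\x04"      # ZIP local file header signature
MAIN_PART = "word/document.xml"       # main document part of a DOCX package
RUN_TEXT = re.compile(r"<w:t(?:\s[^>]*)?>([^<]*)")  # text inside <w:t> runs


def find_zip_entries(raw: bytes):
    """Yield (name, method, data_offset) for every local file header in raw."""
    pos = raw.find(LOCAL_HEADER_SIG)
    while pos != -1:
        header = raw[pos:pos + 30]    # fixed part of a local header is 30 bytes
        if len(header) == 30:
            fields = struct.unpack("<4sHHHHHIIIHH", header)
            method, name_len, extra_len = fields[3], fields[9], fields[10]
            name = raw[pos + 30:pos + 30 + name_len].decode("utf-8", "replace")
            yield name, method, pos + 30 + name_len + extra_len
        pos = raw.find(LOCAL_HEADER_SIG, pos + 4)


def inflate_partial(raw: bytes, offset: int, chunk: int = 512) -> bytes:
    """Inflate a raw DEFLATE stream, keeping what decodes before any damage."""
    inflater = zlib.decompressobj(-15)  # negative wbits: no zlib header
    out = bytearray()
    for i in range(offset, len(raw), chunk):
        try:
            out += inflater.decompress(raw[i:i + chunk])
        except zlib.error:
            break                       # corrupted data: keep output so far
        if inflater.eof:
            break                       # clean end of the compressed stream
    return bytes(out)


def recover_docx_text(raw: bytes) -> list[str]:
    """Return the paragraphs recoverable from the first main document part."""
    for name, method, data_offset in find_zip_entries(raw):
        if name != MAIN_PART or method != 8:   # 8 = DEFLATE; others skipped
            continue
        xml = inflate_partial(raw, data_offset).decode("utf-8", "replace")
        paragraphs = []
        for piece in xml.split("</w:p>"):      # one piece per paragraph
            runs = RUN_TEXT.findall(piece)
            if runs:
                paragraphs.append(html.unescape("".join(runs)))
        return paragraphs
    return []
```

Because the sketch reads only local file headers, it does not need the central directory at the end of the archive, so it still returns text when the tail of the file is missing or the compressed stream is cut short. Under the same assumptions, the scan could be pointed at other part names, such as the slide parts of a PPTX package or the sheet and shared strings parts of an XLSX package.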
Keywords/Search Tags: Fragment, Textual Content, OOXML, Data Recovery, DOCX, XLSX, PPTX