Research on Efficient Storage and Retrieval of Provenance in Provenance-Aware Storage Systems

A provenance-aware storage system is a storage system that automatically collects and manages the provenance of objects, and it is a revolutionary development in storage systems. It not only stores the objects themselves (files, processes, pipes) but is also responsible for maintaining their provenance. Provenance is metadata that describes the detailed history of an object, and it enhances the value of digital information. However, with the explosive growth of digital information, storage systems face enormous challenges in storing it, and the amount of provenance is ten times that of the digital data. How to store and retrieve provenance efficiently is therefore an urgent problem.
To store and retrieve provenance efficiently, this work analyzes the principles of provenance-aware storage systems and studies the types of provenance and the methods for collecting it. A provenance-aware storage system collects the provenance of all objects in the system, including files, processes, and pipes. Among these, the provenance of objects that do not affect files on the provenance-aware storage volumes has no value, and the amount of such valueless provenance is several times that of the valuable provenance. Therefore, the valuable provenance is extracted from what the system collects and the valueless provenance is eliminated, which greatly reduces the storage space required. Because provenance is semi-structured, storage and retrieval techniques for semi-structured data are analyzed, a directed acyclic graph is proposed to represent provenance, and algorithms for detecting and eliminating cycles and for avoiding cycles are analyzed. The storage format of provenance is studied, and a centralized storage method is proposed. Three solutions for storing and retrieving provenance are presented, based respectively on a graph database, Berkeley DB, and flat files, and their differences in performance are analyzed theoretically. Storage and retrieval of provenance are then designed and implemented with Berkeley DB and with flat files.
Finally, the system is tested on a local storage system. The correctness of the provenance after extraction is analyzed to ensure the integrity of the valuable provenance, and the storage space required after extraction is measured. The performance of the database and flat-file approaches in storing, retrieving, and updating provenance is then analyzed.
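The extraction step can be illustrated with a small sketch. It is not the implementation described in this work. It assumes that provenance records are held as a mapping from each object to the set of objects it was derived from, and that an object is "valuable" if it is a file on the provenance-aware volume or a direct or indirect ancestor of one. The function name `prune_provenance` and the object labels are hypothetical.

```python
from collections import deque


def prune_provenance(ancestors, volume_files):
    """Keep only records for objects that affect files on the volume.

    ancestors:    dict mapping an object to the set of objects it was
                  derived from (assumed record layout, for illustration)
    volume_files: objects that are files on the provenance-aware volume
    """
    keep = set()
    queue = deque(volume_files)
    # Breadth-first walk from the volume files back through their ancestors.
    while queue:
        obj = queue.popleft()
        if obj in keep:
            continue  # already visited; also guards against cycles
        keep.add(obj)
        queue.extend(ancestors.get(obj, ()))
    # Everything not reached has no effect on the volume and is dropped.
    return {obj: set(ancestors.get(obj, ())) for obj in keep}


records = {
    "file:/pass/out.txt": {"proc:sort"},
    "proc:sort": {"file:/pass/in.txt", "pipe:1"},
    "pipe:1": {"proc:cat"},
    "proc:cat": set(),
    "proc:ls": {"file:/tmp/x"},   # never touches the volume
    "file:/tmp/x": set(),
}
pruned = prune_provenance(records, ["file:/pass/out.txt", "file:/pass/in.txt"])
```

In this example the records for `proc:ls` and `file:/tmp/x` are discarded because neither is an ancestor of a file on the volume, while the pipe and processes that feed `out.txt` are retained.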
Keywords: provenance, provenance-aware, provenance storage, provenance retrieval
