Representing meaningful provenance in scientific workflow systems

Posted on: 2008-01-24
Degree: M.S.
Type: Thesis
University: University of Wyoming
Candidate: Bryant, Miranda A
Full Text: PDF
GTID: 2448390005450238
Subject: Computer Science
Abstract/Summary:
Data provenance has been a major issue for scientific workflow systems over the past several years. Provenance of data can be considered to take two forms, originally proposed by Buneman et al. Where Provenance relates to the origination of the data, such as its original source. In contrast, Why Provenance relates to the lifetime of the data and to all of the operations and changes made to it over its span of use.

The business community has been the main source of inspiration for how to handle provenance, but it has not considered the primarily data-centric nature of scientific workflows. Whereas scientific workflows focus on data, business workflows are generally based on control flow.

The database community has also tried to address the issues relating to data provenance. Its approaches, such as secondary databases and inversions, need modification because the underlying assumptions differ in systems that are not database-oriented.

Our focus is on modeling data provenance not just at the user level, but also at the intermediate and data levels. By focusing on how the workflow relates to the underlying structure, we can create a more compact representation of provenance at the data level and a more formalized representation at the intermediate level, and display more intuitive data provenance at the user level. We also aim to define more clearly what data provenance is when applied to pipelined workflow systems. This approach will allow the data provenance generated in research to be more compact, formal, and usable.
Keywords/Search Tags: Provenance, Data, Workflow, Systems, Scientific