
Design And Implementation Of A Provenance Framework In Workflow System-Nebulas

Posted on: 2012-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wu
Full Text: PDF
GTID: 2218330368982422
Subject: Computer application technology
Abstract/Summary:
With the development of network and grid technology and the wide use of scientific computing applications, data replication and transformation have become increasingly frequent, and more and more people face the same question: where did this data come from, and is it credible? The question arises most often in biology, medicine, chemistry, physics, astronomy and other scientific fields. Data quality is very important to scientists, so understanding the process that generated a result is highly significant. The data that describes this generating process is called data provenance or data lineage. In scientific workflow management systems, which were developed for large-scale, data-intensive scientific experiments, data processing is transparent to the user. Recording this information manually is unrealistic, especially for cross-regional joint experiments in a grid environment, where hundreds of thousands of computers may participate. This makes data provenance increasingly important in scientific workflow management systems. Provenance provides information about workflow execution and about the origin of data, which greatly facilitates assessing data quality, executing workflows fault-tolerantly, and verifying the validity of experiments. Developing a provenance framework that records the results of this process and the source data in a scientific workflow management system is therefore necessary.

At present there are few studies of data provenance in scientific workflows in China. This thesis studies provenance on the basis of a scientific workflow management system developed for the field of astronomy, and designs and implements an SOA-based data provenance collection framework. The framework is intended to help validate the accuracy of astronomical data results and to support data analysis, and the research also makes a substantive contribution in this area.
This thesis completes a data provenance framework, including the design and implementation of an XML-based data provenance description model and the collection, storage and querying of provenance. Finally, the collection performance of the framework was tested, and the framework showed good collection capacity.
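As a rough illustration of what an XML-based provenance description with collection, storage and querying might look like, the sketch below builds one record per task execution, serializes the records to XML, and traces the lineage of a data product. It is not the thesis's actual model: the element names (`provenanceStore`, `provenanceRecord`, `input`, `output`), the functions `make_record` and `lineage`, and the sample task and file names are all hypothetical.

```python
import xml.etree.ElementTree as ET


def make_record(task, inputs, outputs, host):
    """Collect: build one provenance record for a single task execution.

    The element and attribute names are hypothetical, not the thesis's schema.
    """
    rec = ET.Element("provenanceRecord", {"task": task, "host": host})
    for name in inputs:
        ET.SubElement(rec, "input").text = name
    for name in outputs:
        ET.SubElement(rec, "output").text = name
    return rec


def lineage(store, data_name):
    """Query: list the tasks that produced data_name, nearest task first,
    followed recursively by the tasks that produced its inputs."""
    tasks = []
    for rec in store.findall("provenanceRecord"):
        if any(out.text == data_name for out in rec.findall("output")):
            tasks.append(rec.get("task"))
            for inp in rec.findall("input"):
                tasks.extend(lineage(store, inp.text))
    return tasks


# Store: gather records under one root and serialize them to an XML string.
store = ET.Element("provenanceStore")
store.append(make_record("calibrate", ["raw.fits"], ["calibrated.fits"], "node-01"))
store.append(make_record("stack", ["calibrated.fits"], ["stacked.fits"], "node-02"))
xml_text = ET.tostring(store, encoding="unicode")

# Reload the stored XML and ask how stacked.fits came to be.
reloaded = ET.fromstring(xml_text)
print(lineage(reloaded, "stacked.fits"))
```

The recursive query assumes the workflow graph is acyclic; a real collector running across many grid nodes would also need timestamps, unique record identifiers and persistent storage, none of which are modeled here.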
Keywords/Search Tags:Scientific workflow, Nebulas, data provenance, provenance framework