Research On Workflow Matching And Discovery Based On Data Unification For Proteomics

Posted on: 2014-01-19    Degree: Master    Type: Thesis
Country: China    Candidate: G M Zhai    Full Text: PDF
GTID: 2208330434972186    Subject: Computer software and theory
Abstract/Summary:
Proteomics data analysis is a complex, multistep process. Applying scientific workflow technology to the processing and analysis of proteomics data lets the data move smoothly through the different stages, which greatly promotes scientific discovery in proteomics data analysis. As proteomics data analysis techniques develop, more and more data analysis tools become available. Along with more choices, researchers also face high complexity in workflow construction, including selecting appropriate tools for each stage and tuning the parameters of workflow execution. Reducing the complexity of constructing data analysis workflows is therefore an important open issue.

Although some studies have addressed modeling, storing, and querying scientific workflows and their provenance information, little has been done to build practical systems that support workflow matching and discovery and thereby reduce the complexity of workflow construction. To address this issue, this thesis makes the following contributions.

1. Based on the characteristics of proteomics data analysis, we apply the task-based scientific workflow model to describe the proteomics data analysis process. We also propose a novel provenance model for requirements.
2. We devise the process of provenance-based workflow matching and discovery. Based on this process, we design a system framework and structure the data about workflows and their provenance information for representation, storage, and management.
3. Finally, we implement the provenance-based workflow matching and discovery algorithms; initial experiments demonstrate their effectiveness.

With the proposed provenance model and the workflow matching and discovery algorithms, we implemented a provenance-based workflow matching and discovery system. As a subsystem, it was applied to the CoPExplorer platform, which belongs to the 863 Program under Grant No. 2009AA02Z304. According to the feedback, the system can effectively reduce the complexity of constructing data analysis workflows.
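The abstract does not spell out the matching algorithm, so the following is only a minimal sketch of what provenance-based workflow matching could look like, not the thesis's method. It assumes a workflow is a list of tasks (one per analysis stage), that each provenance record stores the tasks of a past run and the format of its input data, and that candidates are ranked by Jaccard similarity over stage names. All names here (`Task`, `ProvenanceRecord`, `discover`, the stage and tool names) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One stage of a task-based workflow (hypothetical structure)."""
    stage: str          # e.g. "database_search"
    tool: str           # tool chosen for this stage
    params: tuple = ()  # (name, value) pairs used when the tool ran


@dataclass
class ProvenanceRecord:
    """A past workflow execution: what ran, and on what kind of input."""
    workflow_id: str
    tasks: list
    input_format: str


def stage_similarity(wanted_stages, record):
    """Jaccard similarity between requested stages and the stages a record ran."""
    wanted = set(wanted_stages)
    have = {task.stage for task in record.tasks}
    if not wanted and not have:
        return 0.0
    return len(wanted & have) / len(wanted | have)


def discover(records, wanted_stages, input_format, top_k=3):
    """Return up to top_k (score, workflow_id) pairs, best match first.

    Assumption: only records that ran on the same input format are candidates,
    and records sharing no stage with the request are dropped.
    """
    candidates = [r for r in records if r.input_format == input_format]
    scored = [(stage_similarity(wanted_stages, r), r.workflow_id) for r in candidates]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]


if __name__ == "__main__":
    history = [
        ProvenanceRecord(
            "wf-a",
            [Task("peak_picking", "tool_x"), Task("database_search", "tool_y")],
            "mzML",
        ),
        ProvenanceRecord("wf-b", [Task("quantification", "tool_z")], "mzML"),
    ]
    for score, workflow_id in discover(history, ["peak_picking", "database_search"], "mzML"):
        print(workflow_id, round(score, 2))
```

A researcher would describe the stages they need and the data they have; the ranked workflow identifiers then point to earlier runs whose tools and parameters can be reused as a starting point, which is the kind of complexity reduction the abstract describes.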
Keywords/Search Tags: scientific workflow, data provenance, workflow matching and discovery, proteomics data analysis