Looking beneath the edges and nodes: Ranking and mining scientific workflows

Posted on: 2011-12-10
Degree: Ph.D.
Type: Thesis
University: Indiana University
Candidate: Dong, Xiao
Full Text: PDF
GTID: 2448390002469680
Subject: Information Science
Abstract/Summary:
Workflow technology has emerged as a prominent way to support scientific computing. Supported by mature infrastructure such as web services and high-performance computing, it has been widely adopted by the scientific community because it offers an effective framework for prototyping, modifying, and managing sophisticated computational processes. As practitioners in various fields have created a large volume of workflow instances, dedicated repositories have been made available to meet the growing need to store and publish workflow applications and to facilitate discovery and sharing within the respective communities. Despite this promising development, the corresponding workflow discovery capabilities remain inadequate, which limits the impact of the existing workflow infrastructure and the potential benefit of workflow sharing and reuse. This thesis identifies key components in workflow mining and formulates plausible ranking methods around them. More specifically, we introduce a semantic-distance-aware approach that identifies functional and structural similarity simultaneously, thereby allowing workflow mining on a quantifiable and compelling scale. Using this approach, we demonstrate ways to discover semantically related workflow examples that present novel and complementary functionalities, which would otherwise remain undiscovered using traditional methods. Furthermore, we extend the approach to the paradigm of data-centric workflows and demonstrate its promising utility on life science linked open data.
Keywords/Search Tags: Workflow, Scientific, Mining