
Parallelization and fault tolerance supporting near-real-time data processing for bioremediation

Posted on: 2014-09-16
Degree: Ph.D.
Type: Dissertation
University: Colorado School of Mines
Candidate: Hakkarinen, Douglas
Full Text: PDF
GTID: 1458390008455336
Subject: Computer Science
Abstract/Summary:
The push toward interdisciplinary research has expanded the interactions and complexity of research tasks. In this dissertation, we explore three separate but ultimately related lines of research, all tied together through the SmartGeo interdisciplinary research program and the vision of a system for near-REal-time Autonomous bioremediation of ConTamination in the Subsurface (REACTS).

The technical focus of this dissertation is on several key problems, arising from increasing data sizes, that REACTS may face. The main strategy for handling these large data sizes and the accompanying performance problems is the use of large parallel clusters. We explore the runtime performance improvement of a parallel version of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm that REACTS could use to process self-potential data collected in the field. Because such clusters may be large, we also examine methods for fault tolerance in high performance computing (HPC) systems. In the process, we contribute two additional fault tolerance methods for HPC systems: Multi-level Diskless Checkpointing and Algorithm-Based Fault Tolerance for Cholesky Factorization.

Additionally, as part of the SmartGeo program, we examine a policy topic relevant to the development of software under a grant such as SmartGeo: the relationship between the National Science Foundation's Broader Impacts criterion and the development of Open Source Software.

Through this interdisciplinary work, we advance the building blocks needed to make feasible an intelligent bioremediation system such as the one described by the REACTS vision.
Keywords/Search Tags:Fault tolerance, REACTS, Data