
A failure index for high performance computing applications

Posted on: 2013-10-19
Degree: Ph.D
Type: Dissertation
University: Louisiana Tech University
Candidate: Chandler, Clayton F
Full Text: PDF
GTID: 1452390008963714
Subject: Statistics
Abstract/Summary:
This dissertation introduces a new metric in the area of High Performance Computing (HPC) application reliability and performance modeling. Derived via the time-dependent implementation of an existing inequality measure, the Failure index (FI) generates a coefficient representing the level of volatility for the failures incurred by an application running on a given HPC system in a given time interval. This coefficient presents a normalized cross-system representation of the failure volatility of applications running on failure-rich HPC platforms. Further, the origin and ramifications of application failures are investigated, from which certain mathematical conclusions yield greater insight into the behavior of these applications in failure-rich system environments.

This work also includes background information on the problems facing HPC applications at the highest scale, the lack of standardized application-specific metrics within this arena, and a means of generating such metrics in a low latency manner. A case study containing detailed analysis showcasing the benefits of the FI is also included.
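The abstract does not name the inequality measure behind the FI or say how it is made time-dependent. As an illustration only, the sketch below assumes the Gini coefficient and assumes that failures are counted in equal-width time bins over the interval of interest. The function names `gini` and `failure_counts`, the binning scheme, and the sample timestamps are all hypothetical and are not taken from the dissertation.

```python
from typing import List, Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values.

    0.0 means the values are perfectly even; values approaching 1.0 mean
    they are concentrated in a few entries. Assumed here as a stand-in for
    the unnamed inequality measure.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Closed form on ascending data: G = 2*sum(i*x_i)/(n*total) - (n+1)/n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def failure_counts(failure_times: Sequence[float],
                   start: float, end: float, bins: int) -> List[int]:
    """Count failures per equal-width time bin in [start, end).

    The binning is an assumption made for this sketch.
    """
    width = (end - start) / bins
    counts = [0] * bins
    for t in failure_times:
        if start <= t < end:
            index = min(int((t - start) / width), bins - 1)
            counts[index] += 1
    return counts


# Hypothetical failure timestamps (hours) for one application run.
counts = failure_counts([0.5, 1.5, 1.7, 9.9], start=0.0, end=10.0, bins=5)
volatility = gini(counts)
```

Under these assumptions, evenly spread failures give a coefficient near 0 and bursty failures give a higher one. Because the value is normalized, it can be compared across systems with different absolute failure counts, which matches the cross-system role the abstract describes for the FI.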
Keywords/Search Tags:Performance, Application, HPC, Failure