Supporting I/O for remote visualization of high-performance scientific simulations

Posted on:2004-01-24

Degree:Ph.D

Type:Thesis

University:University of Illinois at Urbana-Champaign

Candidate:Lee, Jonghyun

GTID:2458390011953261

Subject:Computer Science

Abstract/Summary:

Scientific data generated by large-scale parallel simulations are usually visualized or post-processed on a different, often remote, platform. This distributed setup may impose three I/O performance challenges. First, simulation codes periodically write copious data to disk. Second, these data need to be migrated to the remote platform, and the network throughput is typically much lower than that of a parallel file system. Third, typical post-simulation activities involving the migrated data are read-intensive, and can be slowed down by load imbalance when these tools run with heterogeneous disks, which are common in modern clusters. Without intelligent approaches to address these challenges, I/O can be a serious performance bottleneck.

This thesis presents techniques to address these I/O performance issues. First, for efficient data migration, we propose an architecture that integrates a parallel I/O library and a migration engine. We examine the use of data compression and a novel buffering scheme with this integrated architecture, to reduce application turnaround time. We also introduce performance models for several I/O and migration methods, and show how these models can be used to control the usage of I/O and migration resources. Second, we study data declustering across heterogeneous disks. Declustering distributes data over multiple disks, enabling efficient execution of visualization queries that retrieve only the areas of interest in each data set. We show how to use virtual servers to enable easy adaptation of existing declustering approaches to a heterogeneous disk environment, and propose methods and algorithms to decide the number of virtual servers and the mapping of virtual servers to disks.

We present the results of experiments with the Panda parallel I/O library that show that our proposed approach to data migration can reduce application turnaround time significantly. We also show that our declustering approaches can reduce the retrieval time for visualization queries on heterogeneous disks, while lessening performance variance across different queries that retrieve the same amount of data.
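
The virtual-server idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes that each disk is characterized by a single bandwidth figure, that virtual servers are apportioned to disks in proportion to that bandwidth (largest-remainder rounding), and that plain round-robin stands in for an existing declustering scheme designed for identical disks. The function names `assign_virtual_servers` and `decluster_round_robin` are hypothetical.

```python
def assign_virtual_servers(disk_bandwidths, num_virtual_servers):
    """Apportion virtual servers to disks in proportion to bandwidth.

    Assumption: proportional allocation with largest-remainder rounding.
    Returns a list where entry i is the number of virtual servers on disk i.
    """
    if num_virtual_servers < 0 or any(b <= 0 for b in disk_bandwidths):
        raise ValueError("need positive bandwidths and a non-negative count")
    total = sum(disk_bandwidths)
    quotas = [num_virtual_servers * b / total for b in disk_bandwidths]
    counts = [int(q) for q in quotas]  # floor of each ideal share
    leftover = num_virtual_servers - sum(counts)
    # Give the remaining servers to the disks with the largest fractional share.
    by_fraction = sorted(range(len(quotas)),
                         key=lambda i: quotas[i] - counts[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


def decluster_round_robin(num_blocks, counts):
    """Place blocks on disks via virtual servers.

    Round-robin over virtual servers is a stand-in for any declustering
    scheme that assumes identical servers; each virtual server is then
    mapped to the physical disk that hosts it.
    """
    vs_to_disk = [disk for disk, c in enumerate(counts) for _ in range(c)]
    if not vs_to_disk:
        raise ValueError("at least one virtual server is required")
    return [vs_to_disk[block % len(vs_to_disk)] for block in range(num_blocks)]
```

With hypothetical relative bandwidths of 50, 30, and 20 and seven virtual servers, `assign_virtual_servers([50, 30, 20], 7)` returns `[4, 2, 1]`, so faster disks host more virtual servers and therefore receive a larger share of the blocks. The declustering scheme itself stays unchanged, since it only ever sees a set of equal virtual servers.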

Keywords/Search Tags:

I/O, Data, Performance, Visualization, Remote, Heterogeneous disks, Parallel

Related items

1	The Study Of Dynamic Heterogeneous Virtual Disks Array And Its Pivotal Technologies
2	Research Of Parallel Program Performance Visualization Under Heterogeneous Platforms
3	Implementing Transparent Compression and Leveraging Solid State Disks in a High Performance Parallel File System
4	The Research On The Visualization In Scientific Computing Of Remote Real-Time Tracing Visualization For The Parallel Application Program
5	High-Performance Processing Of Massive Remote Sensing Data And Its Visualization Application
6	Research And Implementation Of Data Prefetch-based Vector Line Parallel Visualization Algorithm In Large Scale Flow Field
7	Research Of Parallel Computing Visualization Based On MPI
8	Research And Implement Of Remote Interactive Visualization Technology For High Computing Data
9	Research On Remote Interactive Visualization Technologies For CAE Data
10	Research On Prefetching And Caching Technologies Of Parallel Disks System