The performance mismatch between storage subsystems and microprocessors is a bottleneck in high-performance computing. The mismatch stems from the lower bandwidth and higher latency of hard disk drives compared with main memory. Three techniques (prefetching, write-behind, and parallelism) are commonly used to address this problem.

In this thesis, we design and implement a user-level Parallel Disk Input-Output Library (PDIOL). The goal of PDIOL is to improve the performance of sequential applications by parallelizing their I/O operations across all workstations in a cluster. PDIOL also uses prefetching and write-behind. We evaluate the performance of PDIOL with a suite of application benchmarks that includes grep, sort, and bzip2. The results show that I/O-intensive applications benefit most and computation-intensive applications benefit least, which is consistent with our intuition.