Research On Optimizing Parallel I/O Operations In Cluster Environment

Posted on:2017-04-13

Degree:Doctor

Type:Dissertation

Country:China

Candidate:W F Liu

Full Text:PDF

GTID:1108330485479144

Subject:Computer system architecture

Abstract/Summary:

With the development of high performance computing systems, many scientific applications in fields such as astrophysics, nanotechnology, high-energy physics, and climate forecasting are becoming increasingly data intensive. Although scientific applications usually run on supercomputers that use high-speed parallel file systems as their I/O subsystems, many of them achieve only low I/O bandwidth. The main reason is that most parallel file systems are optimized for large contiguous data accesses, whereas in many parallel applications each process accesses a large number of small, noncontiguous pieces of data. At the same time, on the path to exascale computing, more and more effort is being devoted to reducing energy consumption and improving energy efficiency in high performance computing systems.

Parallel applications divide their workloads into sub-tasks and assign them to different processes, which obtain the shared data through parallel I/O operations. There are two kinds of parallel I/O operations: non-collective I/O and collective I/O. Non-collective I/O is usually used by applications that share large amounts of data among loosely coupled processes, such as parallel rendering. In these applications, each process retrieves the data it needs individually, and the processes do not communicate with one another. In contrast, collective I/O is usually used by applications that share a single file among tightly coupled processes. In these applications, all processes cooperate to complete a single I/O operation.

The relatively low I/O speed makes processors spend much of their time waiting for data, which wastes both computing capability and energy. For these reasons, improving the speed and energy efficiency of parallel I/O operations is becoming increasingly important. This thesis investigates the optimization of parallel I/O.
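To make the contrast between small noncontiguous accesses and cooperative I/O concrete, the following is a minimal Python sketch, not the method of this thesis. It mimics the request-aggregation idea commonly associated with two-phase collective I/O: the small per-process requests are merged into larger contiguous ranges before the file system is touched. The function `coalesce_requests` and the interleaved access pattern are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (byte offset, length in bytes)


def coalesce_requests(per_process: List[List[Request]]) -> List[Request]:
    """Merge every process's small requests into sorted, contiguous ranges."""
    # Gather all non-empty requests from all processes and sort them by offset.
    all_requests = sorted(
        (off, length) for reqs in per_process for off, length in reqs if length > 0
    )
    merged: List[Request] = []
    for off, length in all_requests:
        if merged and off <= merged[-1][0] + merged[-1][1]:
            # The request touches or overlaps the previous range: extend it.
            prev_off, prev_len = merged[-1]
            new_end = max(prev_off + prev_len, off + length)
            merged[-1] = (prev_off, new_end - prev_off)
        else:
            merged.append((off, length))
    return merged


# Hypothetical pattern: 4 processes each read three 4-byte pieces that are
# interleaved with the other processes' pieces inside one shared file.
requests = [[(rank * 4 + i * 16, 4) for i in range(3)] for rank in range(4)]
independent_calls = sum(len(r) for r in requests)
collective_ranges = coalesce_requests(requests)
print(independent_calls, collective_ranges)  # prints: 12 [(0, 48)]
```

In this toy pattern, twelve independent 4-byte accesses collapse into a single 48-byte contiguous access, which is the kind of request a parallel file system handles well. A real collective I/O implementation would also have to redistribute the data among processes after the access, which this sketch omits.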
Building on existing research, we propose methods to speed up non-collective and collective I/O operations and to improve the energy efficiency of parallel I/O operations. The main contributions of this thesis are as follows.

First, we propose a method to improve the performance of non-collective I/O operations. In loosely coupled applications, processes running on different nodes share many files. We design a self-organizing distributed memory cache that runs in the cluster environment and caches the shared files, and all processes can access the cache transparently. Once a file is loaded, all the MPI processes of the working unit can bypass the centralized file system and read the data directly from the distributed memory cache. We present the architecture of the distributed memory cache and test its performance, which proves to be sufficiently good.

Second, we propose a tuning strategy for collective I/O operations. The optimization mechanism of collective I/O exposes several tuning parameters. A naive tuning strategy is to execute the application with every possible combination of parameter values, which quickly becomes impractical, so a more efficient strategy is needed. We build a model that describes the execution procedure of collective I/O and design a new tuning strategy based on this model. The strategy helps find a good enough configuration within a reasonable time.

Third, we propose a method to improve the energy efficiency of parallel I/O operations. On the path to exascale computing, more and more effort is being devoted to reducing energy consumption and improving energy efficiency in high performance computing systems. Obtaining energy consumption information for parallel applications is critical for improving the energy efficiency of supercomputers.
However, current energy measurement tools cannot be set up and run automatically in a cluster environment, and this problem needs to be addressed. With a distributed measuring framework that collects the energy consumption of all nodes without the aid of power meters, it becomes possible to obtain detailed energy information for a parallel application. Using this tool, it is feasible to identify the parameters that affect a parallel program's energy efficiency and to build models of energy consumption. Energy efficiency can then be optimized based on these models.
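As a rough illustration of how per-node measurements can be combined into one energy figure for a parallel job, the sketch below integrates sampled power over time on each node and sums the results. It is an assumption-laden example and does not reproduce the measuring framework described above: the sample format, the node names, the helper functions `node_energy_joules` and `job_energy_joules`, and the `bytes_moved` value are all hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, power in watts)


def node_energy_joules(samples: List[Sample]) -> float:
    """Integrate one node's power samples over time (trapezoidal rule)."""
    ordered = sorted(samples)
    return sum(
        (t1 - t0) * (p0 + p1) / 2.0
        for (t0, p0), (t1, p1) in zip(ordered, ordered[1:])
    )


def job_energy_joules(per_node: Dict[str, List[Sample]]) -> float:
    """Sum the energy of all nodes that took part in the job."""
    return sum(node_energy_joules(s) for s in per_node.values())


# Hypothetical power estimates collected on two nodes during one I/O phase.
samples = {
    "node0": [(0.0, 100.0), (1.0, 120.0), (2.0, 110.0)],
    "node1": [(0.0, 90.0), (2.0, 90.0)],
}
bytes_moved = 4_000_000  # hypothetical amount of data transferred in the phase

energy = job_energy_joules(samples)    # 225 J for node0 plus 180 J for node1
bytes_per_joule = bytes_moved / energy  # one possible energy-efficiency metric
print(energy, bytes_per_joule)
```

A metric such as bytes transferred per joule makes it possible to compare I/O configurations by energy efficiency instead of by speed alone. Which metric the thesis itself uses is not stated in the abstract.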

Keywords/Search Tags:

High Performance Computing, Parallel I/O Operation, Performance Modeling, Energy Efficiency Optimization

Related items

1	Research On Performance Prediction And Energy Efficiency Optimization Of HPC Programs
2	Compiler Techniques for High Performance Computing, Energy Efficiency, and Resilience
3	Technologies For Energy-efficient And High-Performance GPGPU Computing
4	Improving power and performance efficiency in parallel and distributed computing systems
5	Modeling Of High Performance Computing On Many-core Processors
6	Research On Automatic Generation Of Analytical Performance Model For Parallel Program
7	Design And Research On A Parallel Performance Data Collection, Representation And Analysis Framework For The SMP-Cluster Architecture
8	Energy-aware Task Scheduling Algorithms And Application For High-performance Computing
9	Modeling and optimization of high-performance many-core systems for energy-efficient and reliable computing
10	Research On Key Issues Of Performance Optimization In High Performance Computing Based On The Godson