
Research On Optimizing Parallel I/O Operations In Cluster Environment

Posted on: 2017-04-13; Degree: Doctor; Type: Dissertation
Country: China; Candidate: W F Liu; Full Text: PDF
GTID: 1108330485479144; Subject: Computer system architecture
Abstract/Summary:
As high performance computing systems develop, many scientific applications in fields such as astrophysics, nanotechnology, high-energy physics, and climate forecasting are becoming increasingly data intensive. Although these applications usually run on supercomputers that use high-speed parallel file systems as their I/O subsystems, many of them achieve only low I/O bandwidth. The main reason is that most parallel file systems are optimized for large contiguous data accesses, whereas in many parallel applications each process accesses a large number of small, noncontiguous pieces of data. At the same time, in the push toward exascale computing, increasing effort is being devoted to reducing energy consumption and improving energy efficiency in high performance computing systems.

Parallel applications divide their workloads into sub-tasks and assign them to different processes. The processes obtain shared data through parallel I/O operations, of which there are two kinds: non-collective I/O and collective I/O. Non-collective I/O is usually used by applications that share large amounts of data among loosely coupled processes, such as parallel rendering. In these applications, each process fetches the data it needs individually, and the processes do not communicate with one another. In contrast, collective I/O is usually used by applications that share a single file among tightly coupled processes. In these applications, all processes cooperate to complete one I/O operation.

Relatively low I/O speed forces processors to spend much of their time waiting for data, which wastes both computing capability and energy. For these reasons, improving the speed and energy efficiency of parallel I/O operations is becoming increasingly important. This thesis studies the optimization of parallel I/O operations.
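The mismatch between many small noncontiguous accesses and a file system tuned for large contiguous ones can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the `coalesce` and `io_time` helpers, the interleaved 1 KiB block layout, and the latency and bandwidth figures. The sketch also ignores the inter-process data exchange that real collective I/O requires.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (offset, length) in bytes


def coalesce(requests: List[Request]) -> List[Request]:
    """Merge requests that touch or overlap into larger contiguous ranges."""
    merged: List[Request] = []
    for offset, length in sorted(requests):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # The request starts inside or right at the end of the last range.
            start, prev_len = merged[-1]
            end = max(start + prev_len, offset + length)
            merged[-1] = (start, end - start)
        else:
            merged.append((offset, length))
    return merged


def io_time(requests: List[Request],
            latency_s: float = 1e-3,
            bandwidth_bps: float = 1e9) -> float:
    """Toy cost model (assumed): fixed latency per request plus bytes / bandwidth."""
    total_bytes = sum(length for _, length in requests)
    return len(requests) * latency_s + total_bytes / bandwidth_bps


# Hypothetical layout: four processes read interleaved 1 KiB blocks of one file.
BLOCK = 1024
per_process = [
    [(BLOCK * (i * 4 + rank), BLOCK) for i in range(256)]
    for rank in range(4)
]

# Non-collective style: every process issues its own small requests.
independent = [req for reqs in per_process for req in reqs]

# Collective style: the requests are combined before touching the file system.
collective = coalesce(independent)

print(len(independent), "requests take", io_time(independent), "s (toy model)")
print(len(collective), "request takes", io_time(collective), "s (toy model)")
```

Under this model the per-request latency dominates when the requests are small, which is why combining them into one contiguous access pays off. The size of the gain depends entirely on the assumed latency and bandwidth.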
Building on existing research results, we propose methods to improve the speed of non-collective and collective I/O operations, and we also seek to improve the energy efficiency of parallel I/O operations. The main contributions of this thesis are listed below.

First, we propose a method to improve the performance of non-collective I/O operations. In loosely coupled applications, processes running on different nodes share many files. We design a self-organizing distributed memory cache that runs in the cluster environment and caches the shared files, and all processes can access the cache transparently. Once a file is loaded, all the MPI processes of the working unit can bypass the centralized file system and get the data directly from the distributed memory cache. We present the architecture of the distributed memory cache and test its performance; the tests indicate that the caching system performs sufficiently well.

Second, we propose a tuning strategy for collective I/O operations. The optimization mechanism of collective I/O exposes several tuning parameters. A naive tuning strategy is to execute the application with every possible combination of parameter values, so a more efficient strategy is needed. We build a model that describes the execution procedure of collective I/O and design a new tuning strategy based on this model. The strategy helps find a good-enough configuration within a reasonable time.

Third, we propose a method to improve the energy efficiency of parallel I/O operations. In the push toward exascale computing, increasing effort is being devoted to reducing energy consumption and improving energy efficiency in high performance computing systems. Obtaining energy consumption information for parallel applications is critical for improving the energy efficiency of supercomputers.
However, current energy measurement tools cannot be set up and run automatically in a cluster environment, and this problem needs to be addressed. With the help of a distributed measuring framework that collects the energy consumption of all nodes without the aid of power meters, it is possible to obtain detailed energy information for a parallel application. Using this tool, it is feasible to identify the parameters that affect a parallel program's energy efficiency and to build a model of its energy consumption. The energy efficiency can then be optimized based on these models.
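As a minimal sketch of how per-node readings could be combined into application-level energy figures, the example below assumes each node reports timestamped power estimates. The abstract does not say how the framework obtains its estimates, so the sample format, the function names, and the bytes-per-joule efficiency metric are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, estimated power in watts)


def node_energy_joules(samples: List[Sample]) -> float:
    """Integrate power over time with the trapezoidal rule."""
    ordered = sorted(samples)
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(ordered, ordered[1:]):
        energy += 0.5 * (p0 + p1) * (t1 - t0)
    return energy


def cluster_energy_joules(per_node: Dict[str, List[Sample]]) -> float:
    """Sum the energy of every node that took part in the run."""
    return sum(node_energy_joules(samples) for samples in per_node.values())


def bytes_per_joule(bytes_moved: float, energy_j: float) -> float:
    """One possible efficiency metric (assumed): I/O volume per unit of energy."""
    return bytes_moved / energy_j


# Made-up samples for two nodes over a two-second I/O phase.
samples = {
    "node0": [(0.0, 100.0), (1.0, 120.0), (2.0, 100.0)],
    "node1": [(0.0, 90.0), (2.0, 90.0)],
}
total = cluster_energy_joules(samples)
print("total energy (J):", total)
print("bytes per joule:", bytes_per_joule(4e9, total))
```

With figures like these recorded for runs under different I/O settings, the settings can be compared by energy as well as by speed.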
Keywords/Search Tags: High Performance Computing, Parallel I/O Operation, Performance Modeling, Energy Efficiency Optimization