
Research and Implementation of a High-Availability MPI Parallel Programming Environment and Parallel Programming Methods

Posted on: 2008-04-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Xie
Full Text: PDF
GTID: 1118360242499347
Subject: Computer Science and Technology
Abstract/Summary:
With the progress of science and technology, scientific computing and numerical simulation are adopted in more and more disciplines for problem solving. These scientific problems often demand far more computation, storage, and communication, so large-scale parallel computing systems have become the mainstream architecture of current high performance computing. As the scale of parallel computing systems grows, scalability becomes a problem and system reliability declines; in some very large parallel computing systems the mean time between failures may be only a few hours. Under such conditions, many large-scale parallel applications cannot run efficiently and complete successfully without a high-performance, fault-tolerant parallel software development and runtime environment.

Message passing is the mainstream programming model for developing parallel applications. With its flexible support for implementing parallel algorithms, high performance, and good portability, MPI is now the de facto standard message passing API. Focusing on how to improve the availability of large parallel computing systems and applications, this thesis studies several key problems in implementing a high-availability MPI parallel programming environment, including performance, scalability, and fault tolerance. In addition, aiming at more effective fault tolerance on future very large scale parallel computing systems, we also study design methods and rules for efficient fault-tolerant parallel algorithms in MPI programs. The main contributions of this thesis can be summarized as follows:

1) Oriented to large-scale parallel computing systems, the architecture of a new communication hardware interface, CNI, is proposed. Based on CNI, we implemented the Communication Express (CMEX) software interface. CMEX provides protected, concurrent, and completely user-level communication operations, supports zero-copy data transfer between processes, and scales well thanks to its connectionless packet transfer and RDMA communication mechanisms. We also put forward basic methods for validating the CMEX communication software interface by means of static program analysis and model checking, to ensure the correctness and reliability of the CMEX implementation.

2) Based on the CMEX software interface and the MPICH2 system, we studied technical approaches to implementing a high-performance, scalable MPI parallel programming environment, MPICH2-CMEX, using the RDMA communication mechanism. To improve performance, we designed and implemented efficient message data transfer using RDMA read and write operations. To improve scalability, we first proposed a dynamic feedback credit flow control algorithm: communication resources are used more effectively because the credits between tasks that exchange messages frequently are enlarged dynamically (a minimal sketch of this credit idea is given below). Second, we proposed hybrid channel data transfer methods that exploit the nearest-neighbor exchange pattern of many parallel applications; as the scale of a parallel application grows, the MPI system's internal consumption of communication and memory resources can be controlled while the runtime performance of the application is preserved.
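To make the credit-based flow control concrete, the following is a minimal, self-contained C sketch, not the MPICH2-CMEX implementation: each peer starts with a small send window, a send consumes a credit and an acknowledgement returns one, and a periodic feedback step doubles the window of peers that communicated frequently in the last interval while shrinking idle ones. All names and thresholds (peer_credit, feedback_adjust, HOT_THRESHOLD, and so on) are illustrative assumptions.

```c
/* Minimal model of dynamic feedback credit flow control.
 * Illustrative only -- not the MPICH2-CMEX implementation. */
#include <stdio.h>

#define NPEERS        4
#define INIT_CREDITS  2   /* small initial window per peer      */
#define MAX_CREDITS   16  /* upper bound on any peer's window   */
#define HOT_THRESHOLD 8   /* sends per interval => "hot" peer   */

struct peer_credit {
    int credits;     /* send credits currently available        */
    int window;      /* current window size (max credits)       */
    int sends;       /* sends observed in the current interval  */
};

static struct peer_credit peers[NPEERS];

static void init_peers(void)
{
    for (int i = 0; i < NPEERS; i++)
        peers[i] = (struct peer_credit){ INIT_CREDITS, INIT_CREDITS, 0 };
}

/* Returns 1 if the send may proceed, 0 if the sender must wait. */
static int try_send(int peer)
{
    if (peers[peer].credits == 0)
        return 0;                    /* window exhausted: back-pressure */
    peers[peer].credits--;
    peers[peer].sends++;
    return 1;
}

/* Called when the receiver acknowledges (frees a receive buffer). */
static void on_ack(int peer)
{
    if (peers[peer].credits < peers[peer].window)
        peers[peer].credits++;
}

/* Periodic feedback: enlarge the windows of frequently communicating
 * peers, shrink the windows of idle ones, then reset the counters.   */
static void feedback_adjust(void)
{
    for (int i = 0; i < NPEERS; i++) {
        if (peers[i].sends >= HOT_THRESHOLD && peers[i].window < MAX_CREDITS)
            peers[i].window *= 2;
        else if (peers[i].sends == 0 && peers[i].window > INIT_CREDITS)
            peers[i].window /= 2;
        if (peers[i].credits > peers[i].window)
            peers[i].credits = peers[i].window;
        peers[i].sends = 0;
    }
}

int main(void)
{
    init_peers();
    /* Simulate one interval: peer 0 is hot, peer 1 is lightly used. */
    for (int k = 0; k < 10; k++) {
        if (try_send(0)) on_ack(0);  /* ack immediately in this toy model */
        if (k < 3 && try_send(1)) on_ack(1);
    }
    feedback_adjust();
    for (int i = 0; i < NPEERS; i++)
        printf("peer %d: window=%d credits=%d\n",
               i, peers[i].window, peers[i].credits);
    return 0;
}
```

In this toy model the "hot" peer ends the interval with a doubled window while idle peers keep the small initial one, which mirrors the dynamic enlargement of resources between frequently communicating tasks described above.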
Using MPICH2-CMEX, we obtained good speedups in many parallel application tests.

3) For the fault tolerance of MPI parallel applications, we designed and implemented a user-transparent, system-level parallel checkpointing system in MPICH2-CMEX based on a blocking coordinated checkpointing protocol. The coordination protocol and checkpoint image storage are the two main sources of overhead in a parallel checkpointing system. Our system exploits the nearest-neighbor exchange pattern of many parallel applications and uses a virtual connection technique to reduce the number of internal messages exchanged in the coordination stage, thereby reducing the latency of protocol processing; a simplified sketch of the coordination steps is given at the end of this abstract. A global parallel file system is used to store checkpoint images, which simplifies the management of image files and enables parallel I/O in the image storage stage. Experiments with several parallel applications show that this checkpointing system has small runtime overhead and good scalability, and that it provides good support for the fault-tolerant execution of many long-running parallel applications.

4) Oriented to the fault tolerance of parallel applications on future very large scale parallel computing systems, we propose a methodology for designing fault-tolerant parallel algorithms (FTPA) in MPI parallel programs. FTPA achieves fault tolerance by recomputing the work of a failed task in parallel on the surviving processes. We discuss the core idea of FTPA and the key problems in its implementation, and propose methods and rules for analyzing use-definition chains across processes. We also describe the implementation of FTPA in two parallel programs. Experiments on a parallel computing system show that FTPA has low runtime overhead and good scalability. Used in combination with a checkpointing system, FTPA will be an effective technical approach to fault tolerance for parallel applications.
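The blocking coordinated checkpointing protocol of contribution 3 can be illustrated with a small application-level sketch. The thesis implements the mechanism transparently inside MPICH2-CMEX and reduces coordination messages via virtual connections; the toy MPI program below only shows the basic coordination steps (quiesce, write images to a shared file system in parallel, agree on completion, resume). The path /gpfs/ckpt and all function names are illustrative assumptions.

```c
/* Application-level sketch of blocking coordinated checkpointing.
 * The thesis describes a user-transparent, system-level mechanism inside
 * MPICH2-CMEX; this toy version only illustrates the coordination steps.
 * File names and the state layout are illustrative assumptions. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Write this process's state to a per-rank image on a shared file system. */
static void write_checkpoint(int rank, const double *state, int n, int version)
{
    char path[256];
    /* "/gpfs/ckpt" stands in for the global parallel file system. */
    snprintf(path, sizeof(path), "/gpfs/ckpt/app.%d.rank%d.img", version, rank);
    FILE *f = fopen(path, "wb");
    if (!f) { perror("fopen"); MPI_Abort(MPI_COMM_WORLD, 1); }
    fwrite(state, sizeof(double), (size_t)n, f);
    fclose(f);
}

/* Blocking coordinated checkpoint: quiesce, write, agree, resume. */
static void coordinated_checkpoint(const double *state, int n, int version)
{
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 1. All processes reach the checkpoint line with no application
     *    messages in flight (the real system drains channels instead). */
    MPI_Barrier(MPI_COMM_WORLD);

    /* 2. Every process saves its local state; the shared file system
     *    allows the image writes to proceed in parallel. */
    write_checkpoint(rank, state, n, version);

    /* 3. Agree that the global checkpoint is complete before resuming. */
    MPI_Barrier(MPI_COMM_WORLD);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    enum { N = 1024 };
    double *state = malloc(N * sizeof *state);
    for (int i = 0; i < N; i++) state[i] = rank + i;   /* fake local state */

    /* Checkpoint periodically between computation phases. */
    for (int step = 1; step <= 3; step++) {
        for (int i = 0; i < N; i++) state[i] += 1.0;   /* fake computation */
        coordinated_checkpoint(state, N, step);
        if (rank == 0) printf("checkpoint %d written\n", step);
    }

    free(state);
    MPI_Finalize();
    return 0;
}
```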
Keywords/Search Tags:High Availability, User-Level Communication, Message Passing Interface (MPI), Checkpointing/Restart, Fault-Tolerant Parallel Algorithm