
MPI-Based Parallel Program Design

Posted on: 2003-01-18  Degree: Master  Type: Thesis
Country: China  Candidate: H Liu  Full Text: PDF
GTID: 2208360065956005  Subject: Computer applications
Abstract/Summary:
Modern computer technology has stimulated the rapid growth of computational science. In practice, limits of speed and technology restrict how widely a single PC can be applied, so high-performance computing is an important subject of study. The key to high-performance computing is designing effective parallel programs. Tools for parallel programming include MPI, PVM, Linda, and others; because MPI (Message Passing Interface) is portable, functionally rich, and efficient, it has become the most important of these tools.

The motive for parallelism is to handle bulky data and increase computational speed. This thesis mainly investigates the cache usage ratio (CUR), a crucial factor in improving the effective speed of a parallel computer, and then analyzes two problems that arise when improving the CUR: subscript restoration and load balance.

On the hardware side, hierarchical memory is used in most computers. Compared with main memory and disk, the cache is the fastest level but has the smallest capacity, so the CUR is a crucial factor in overall computational performance. This thesis uses parallel-programming techniques to distribute portions of the computation so that they fit in the cache, improving the effective speed of high-performance scientific computing programs.

Parallel computation consists mainly of basic linear-algebra kernels, and among these, matrix-matrix multiplication achieves the highest CUR. But not every computation is a matrix operation. Other computations can be transformed into matrix form, processed, and then restored to their original form when finished. While managing these transformations, we must guarantee that results are restored correctly; this is called subscript restoration.
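The cache-blocking idea behind improving the CUR can be illustrated with a small sketch. This is not the thesis's own code; the matrix order N and block size BS are illustrative, with BS chosen so that three BS-by-BS tiles fit in cache.

```c
#include <string.h>

#define N  64   /* matrix order (illustrative) */
#define BS 16   /* block size chosen to fit the cache (illustrative) */

/* Naive triple loop: C = A * B. Each element of B is re-fetched from
   memory many times, so the cache usage ratio is low. */
static void matmul_naive(const double A[N][N], const double B[N][N],
                         double C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double s = 0.0;
            for (int k = 0; k < N; k++)
                s += A[i][k] * B[k][j];
            C[i][j] = s;
        }
}

/* Blocked version: each BS x BS tile is reused while it is still
   resident in cache, raising the cache usage ratio (CUR). */
static void matmul_blocked(const double A[N][N], const double B[N][N],
                           double C[N][N]) {
    memset(C, 0, sizeof(double) * N * N);
    for (int ii = 0; ii < N; ii += BS)
        for (int kk = 0; kk < N; kk += BS)
            for (int jj = 0; jj < N; jj += BS)
                for (int i = ii; i < ii + BS; i++)
                    for (int k = kk; k < kk + BS; k++) {
                        double a = A[i][k];
                        for (int j = jj; j < jj + BS; j++)
                            C[i][j] += a * B[k][j];
                    }
}
```

Both routines perform the same arithmetic; only the memory-access order changes, which is exactly why the CUR can be improved without altering the numerical result.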
This thesis presents the subscript-restoration algorithm and then the corresponding parallel program. In this algorithm, if the number of small matrices is not evenly divisible by the number of processors, the number of tasks each processor handles differs, so some processors are still computing while others wait. Furthermore, different tasks take different amounts of time. If the number of tasks per processor can be adjusted dynamically at run time to keep the work balanced, system performance improves greatly. This is the other aspect of the load-balance problem. The thesis gives two methods for load balancing: averaging weight and multilevel averaging weight.

This thesis mainly researches the Message Passing Interface (MPI) on a PC cluster. First, it surveys the background of MPI, parallel architectures, and models of computation. Second, it presents the main routines in MPI and analyzes the principles of parallel program design. MPI uses the eager and rendezvous protocols to send and receive data, and provides both blocking and nonblocking communication. Nonblocking communication can overlap computation with communication, improving application performance; it also avoids certain flow-control problems, since a program can continue computing while waiting for communication to complete. Collective communication combines communication with computation and synchronization. MPI provides the file view to access noncontiguous data. In most parallel programs, each process communicates with only a few other processes; this pattern of communication is called an application topology or virtual topology, and it expresses the logical ordering of the parallel processes. Third, this thesis demonstrates the new developments in MPI-2, which has three large, completely new areas.
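The static part of the load-balance problem above — distributing m small-matrix tasks over p processors as evenly as possible — can be sketched as follows. This is only an illustration of even partitioning, not the thesis's exact averaging-weight or multilevel averaging-weight algorithm; the function names are made up for this sketch.

```c
/* Distribute m tasks over p processors as evenly as possible:
   the first (m % p) processors receive one extra task, so the
   imbalance between any two processors is at most one task. */
static int tasks_for_rank(int m, int p, int rank) {
    return m / p + (rank < m % p ? 1 : 0);
}

/* Index of the first task owned by `rank` under that distribution. */
static int first_task(int m, int p, int rank) {
    int base = m / p, rem = m % p;
    return rank * base + (rank < rem ? rank : rem);
}
```

For example, 10 tasks over 4 processors yields counts 3, 3, 2, 2 starting at task indices 0, 3, 6, 8. Dynamic rebalancing, as the thesis proposes, would then adjust these assignments at run time when tasks take unequal amounts of time.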
These are parallel I/O, remote memory operations, and dynamic process management. Parallel I/O uses individual file pointers, explicit offsets, and shared file pointers to access noncontiguous data, and supports collective I/O. It also uses subarrays and distributed arrays stored in f...
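The explicit-offset style of parallel I/O mentioned above requires each process to compute where its block lies in the shared file. A minimal sketch of that offset arithmetic for a block-distributed one-dimensional array follows; it makes no MPI calls itself, and the resulting offset is the kind of value a process would pass to an explicit-offset routine such as MPI_File_read_at.

```c
#include <stdint.h>

/* For an array of n elements of size `elem` bytes, block-distributed
   over p processes (the first n % p processes holding one extra
   element), return the byte offset of `rank`'s first element in the
   shared file. */
static int64_t block_offset_bytes(int64_t n, int p, int rank, int64_t elem) {
    int64_t base = n / p, rem = n % p;
    int64_t first = rank * base + (rank < rem ? rank : rem);
    return first * elem;
}
```

With 10 doubles over 4 processes, ranks 0 through 3 start at byte offsets 0, 24, 48, and 64 respectively, so the blocks tile the file with no gaps or overlaps.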
Keywords/Search Tags:message passing, rendezvous protocol, eager protocol, communicator, file view, parallel I/O