
Research On Technologies For MPI Parallel Code Generation And Communication Optimization

Posted on: 2013-03-19    Degree: Master    Type: Thesis
Country: China    Candidate: D Z Chen    Full Text: PDF
GTID: 2248330395980523    Subject: Computer software and theory
Abstract/Summary:
The emergence of grand-challenge problems in scientific and engineering computing, together with advances in microelectronics, has driven the steady development of high-performance computers. As high-performance computers have become widespread, the MPI parallel programming model has become one of the principal programming methods supported by high-performance computing platforms. MPI parallelizing compilation is therefore an important way to develop efficient parallel programs while preserving legacy serial software.

This thesis studies the technologies behind automatic parallel code generation in MPI parallelizing compilation, chiefly code generation and communication optimization. The main work is summarized as follows:

1. The development of the MPI parallelizing compilation field is reviewed in detail, its key technologies are introduced comprehensively, and the state of research on code generation and communication optimization algorithms is analyzed in depth.

2. A code generation algorithm based on data uni-distribution (a single data distribution used throughout the program) is proposed. Typical existing MPI parallelizing compilation systems generate MPI code from the perspective of data redistribution, but the resulting scatter and gather communication overhead leads to low speedups. To address this, given a parallel loop set and a communication data set under the uni-distribution, the algorithm performs basic MPI function transformation, input/output function transformation, parallel loop transformation, and communication function transformation on the abstract syntax trees and symbol tables of the serial intermediate code, enabling the compilation system to generate MPI code with more precise communication. Experimental results show that the algorithm reduces the communication overhead of the generated MPI programs to a large extent and significantly improves their speedups.
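To make the code generation step concrete, here is a minimal hand-written sketch, not actual output of the thesis's system, of the kind of SPMD code such a generator might emit for a simple serial loop. The array names, the problem size N, and the assumption that N is divisible by the number of processes are all invented for the example; input is scattered once, every array keeps the same block distribution, and the result is gathered once, so no redistribution occurs between loops.

/* Illustrative sketch only: hand-written SPMD code of the kind a generator
 * using a single block ("uni") distribution might emit. Array names, N,
 * and the divisibility assumption are invented for the example. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1024  /* assumed divisible by the number of processes */

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local_n = N / size;  /* block size owned by each process */
    double *b_local = malloc(local_n * sizeof(double));
    double *a_local = malloc(local_n * sizeof(double));

    /* Input transformation: rank 0 produces the data, which is scattered
     * ONCE; because all arrays share the same distribution, no
     * redistribution is needed between parallel loops. */
    double *b = NULL, *a = NULL;
    if (rank == 0) {
        b = malloc(N * sizeof(double));
        a = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) b[i] = (double)i;  /* stand-in for file input */
    }
    MPI_Scatter(b, local_n, MPI_DOUBLE,
                b_local, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Parallel loop transformation: the serial loop over 0..N-1 becomes
     * a loop over each process's local block. */
    for (int i = 0; i < local_n; i++)
        a_local[i] = 2.0 * b_local[i] + 1.0;

    /* Output transformation: the result is gathered ONCE at the end. */
    MPI_Gather(a_local, local_n, MPI_DOUBLE,
               a, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("a[N-1] = %f\n", a[N - 1]);

    free(a_local); free(b_local);
    if (rank == 0) { free(a); free(b); }
    MPI_Finalize();
    return 0;
}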
3. A communication optimization algorithm based on merging mpi_alltoallv calls is proposed. Existing communication optimization algorithms aimed at reducing communication overhead cannot lower the cost of mpi_alltoallv communication, because they do not optimize mpi_alltoallv calls during MPI parallelizing compilation. To address this, given the mpi_alltoallv communication data set, the algorithm performs send-count array merging, receive-count array merging, send-buffer loop merging, mpi_alltoallv transformation, and receive-buffer loop merging on the abstract syntax trees of the serial intermediate code, enabling the compilation system to generate MPI code with fewer mpi_alltoallv calls (the first sketch below illustrates the effect). Experimental results show that the algorithm effectively lowers the number of mpi_alltoallv calls, reduces communication overhead, and significantly improves speedups.

4. A communication optimization algorithm based on program transformation is proposed. Existing communication optimization algorithms that hide communication cannot achieve good speedups by hiding point-to-point non-blocking communication during MPI parallelizing compilation. To address this, given the interprocedural data dependence set and a transformation rule based on control dependence, the algorithm applies reordering transformation and loop distribution, in order, to the subtrees of mpi_wait and mpi_irecv call statements on the abstract syntax trees of the serial intermediate code. This expands the communication-computation overlap window of point-to-point non-blocking communication as far as is safe (the second sketch below illustrates the effect), enabling the compilation system to generate MPI code whose communication overlaps with more computation. Experimental results show that the algorithm hides more of the point-to-point non-blocking communication overhead of MPI programs and significantly improves their speedups.
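As a concrete illustration of the merging in item 3, the following hand-written sketch (with invented counts and payloads, not output of the thesis's system) combines what would otherwise be two MPI_Alltoallv calls into one: the per-destination element counts of the two exchanges are added, and the two payloads destined for the same process are packed contiguously into one send buffer.

/* Illustrative sketch only: two consecutive all-to-all exchanges, each
 * sending one element per destination, merged into a single
 * MPI_Alltoallv call. Counts and payloads are invented. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Per-destination element counts of the two original exchanges. */
    int *cnt1 = malloc(size * sizeof(int));
    int *cnt2 = malloc(size * sizeof(int));
    for (int p = 0; p < size; p++) { cnt1[p] = 1; cnt2[p] = 1; }

    /* Element-number array merging: per-destination counts are added
     * and displacements recomputed. */
    int *scnt = malloc(size * sizeof(int));
    int *sdsp = malloc(size * sizeof(int));
    int total = 0;
    for (int p = 0; p < size; p++) {
        scnt[p] = cnt1[p] + cnt2[p];
        sdsp[p] = total;
        total += scnt[p];
    }
    /* The pattern is symmetric here, so receive counts mirror send counts. */
    int *rcnt = scnt, *rdsp = sdsp;

    /* Send-buffer loop merging: the data of both exchanges destined for
     * the same process is packed contiguously. */
    double *sbuf = malloc(total * sizeof(double));
    double *rbuf = malloc(total * sizeof(double));
    for (int p = 0; p < size; p++) {
        sbuf[sdsp[p]]     = 1.0 * rank;  /* payload of exchange 1 */
        sbuf[sdsp[p] + 1] = 2.0 * rank;  /* payload of exchange 2 */
    }

    /* One MPI_Alltoallv instead of two: half the collective calls,
     * the same data moved. */
    MPI_Alltoallv(sbuf, scnt, sdsp, MPI_DOUBLE,
                  rbuf, rcnt, rdsp, MPI_DOUBLE, MPI_COMM_WORLD);

    free(cnt1); free(cnt2); free(scnt); free(sdsp);
    free(sbuf); free(rbuf);
    MPI_Finalize();
    return 0;
}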
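The second sketch, again hand-written with an invented ring pattern, buffers, and workload, shows the effect of the reordering transformation in item 4: mpi_irecv is hoisted above computation that does not depend on the incoming data, and the matching mpi_wait is sunk to just before the first use of that data, so the transfer proceeds while the loop runs.

/* Illustrative sketch only: widening the communication-computation
 * overlap window. MPI_Irecv is posted early and MPI_Wait is deferred
 * until the received value is actually needed. Ring pattern, buffer
 * names, and workload are invented. */
#include <mpi.h>
#include <stdio.h>

#define N 100000

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;

    static double work[N];
    double halo = 0.0, edge = (double)rank;
    MPI_Request rreq, sreq;

    /* Hoisted: receive posted before the independent computation. */
    MPI_Irecv(&halo, 1, MPI_DOUBLE, left, 0, MPI_COMM_WORLD, &rreq);
    MPI_Isend(&edge, 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &sreq);

    /* Computation that (per the interprocedural data-dependence check)
     * touches neither `halo` nor `edge` overlaps with the transfer. */
    double sum = 0.0;
    for (int i = 0; i < N; i++) {
        work[i] = i * 0.5;
        sum += work[i];
    }

    /* Sunk: wait only immediately before the received value is used. */
    MPI_Wait(&rreq, MPI_STATUS_IGNORE);
    sum += halo;
    MPI_Wait(&sreq, MPI_STATUS_IGNORE);

    if (rank == 0) printf("sum = %f\n", sum);
    MPI_Finalize();
    return 0;
}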
Keywords/Search Tags: parallelizing compilation, message passing, code generation, communication optimization, computation decomposition, data distribution