The Research And Implementation Of Hierarchical Parallel Algorithm And MPI-2 New Features

Posted on: 2010-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Yang
Full Text: PDF
GTID: 2178360278960498
Subject: Computer application technology
Abstract/Summary:
With the continuous development of computer technology, from data processing to intelligent processing, the range of computer applications keeps widening and the scale of the problems to be solved grows rapidly. Parallel computing is an important way to meet the needs of these practical applications. Driven by the rapid development of computer technology, the development and use of parallel computers have reached an unprecedented level. Parallel computer architecture has moved from single-core, single-processor, single-node systems to multi-core, multi-processor, multi-node systems. Inter-node communication has reached unprecedented performance and low latency, and the performance of parallel computers has improved remarkably.

Therefore, for current multi-core and even many-processor systems, the hierarchical parallel programming model, which combines the shared-memory programming model with the distributed-memory programming model, is a leading direction. Targeting high-performance applications, the MPI+OpenMP model, which is popular among hierarchical parallel programming models, has been put into practice, drawing on the research on this model over the last decade. MPI is the typical representative of the message-passing programming model, and the OpenMP Application Programming Interface (API) is a de facto standard for parallel programming on shared-memory architectures. Combining the two models allows current multi-core parallel cluster systems to make good use of their advantages.

Starting from the serial LSQR algorithm, which is commonly used in fields such as tomographic inversion and parameter inversion, the inherent parallelism of the algorithm has been explored through analysis.
Using compressed row storage for the large sparse matrix, the compute-intensive parts of the serial LSQR algorithm have been parallelized step by step, including the product of the large sparse matrix with a vector and the product of its transpose with a vector. On this basis, an MPI-based parallel LSQR algorithm for distributed cluster systems has been implemented. By analyzing the fine-grained parallelism of the serial LSQR algorithm and applying the MPI+OpenMP hierarchical parallel programming model, an MPI+OpenMP-based hierarchical parallel LSQR algorithm has been designed and implemented. The performance of the MPI-based algorithm and the MPI+OpenMP-based algorithm has also been compared in order to analyze and verify the parallel computing performance of hierarchical parallel programming.

In addition, the new features of MPI-2, mainly parallel I/O and remote memory access (RMA), have been discussed. The explicit-offset data access mode of parallel I/O has been applied to the parallel LSQR algorithm. Reading the same file at the same time with multiple threads has improved the overall performance of the parallel LSQR algorithm. The data movement of the RMA communication calls MPI_Put, MPI_Get and MPI_Accumulate under the fence synchronization mechanism has been analyzed with small test applications, and test data has been designed to verify the correctness of the remote operations.

The results show that both the MPI-based parallel LSQR algorithm and the MPI+OpenMP-based hierarchical parallel LSQR algorithm achieve good performance. Under the same computing requirements, the MPI+OpenMP hierarchical parallel programming model performs better than the pure MPI parallel programming model.
In addition, incorporating parallel I/O, a new feature of MPI-2, has significantly reduced the execution time of the algorithm, further improving its parallel efficiency.
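
To illustrate the row-compressed storage and the two sparse products that the abstract names as the key computations of LSQR, here is a minimal sketch in plain Python. It is not the thesis's implementation: it assumes a contiguous row-block distribution, an ordinary loop over blocks stands in for MPI ranks, and all function names (`csr_matvec`, `csr_rmatvec`, `row_block`, `partition`, `distributed_products`) are hypothetical.

```python
# Sketch only: a loop over row blocks stands in for MPI ranks.
# All names here are illustrative assumptions, not taken from the thesis.

def csr_matvec(row_ptr, col_idx, vals, x):
    """y = A @ x for A stored in compressed row (CSR) form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y

def csr_rmatvec(row_ptr, col_idx, vals, u, n_cols):
    """z = A.T @ u, scattering each stored entry of row i into z."""
    z = [0.0] * n_cols
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            z[col_idx[k]] += vals[k] * u[i]
    return z

def partition(n_rows, n_ranks):
    """Contiguous row ranges (lo, hi), as even as possible."""
    base, extra = divmod(n_rows, n_ranks)
    bounds, lo = [], 0
    for r in range(n_ranks):
        hi = lo + base + (1 if r < extra else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds

def row_block(row_ptr, col_idx, vals, lo, hi):
    """Rows lo..hi-1 as a local CSR block (what one rank would own)."""
    start, end = row_ptr[lo], row_ptr[hi]
    local_ptr = [p - start for p in row_ptr[lo:hi + 1]]
    return local_ptr, col_idx[start:end], vals[start:end]

def distributed_products(row_ptr, col_idx, vals, x, u, n_ranks):
    """Compute A @ x and A.T @ u block by block, as ranks would."""
    n_rows, n_cols = len(row_ptr) - 1, len(x)
    y, z = [], [0.0] * n_cols
    for lo, hi in partition(n_rows, n_ranks):
        blk = row_block(row_ptr, col_idx, vals, lo, hi)
        # Each block yields its own slice of y (a gather in MPI).
        y.extend(csr_matvec(*blk, x))
        # Each block yields a full-length partial z; partials are summed
        # (a sum reduction in MPI).
        z_part = csr_rmatvec(*blk, u[lo:hi], n_cols)
        z = [a + b for a, b in zip(z, z_part)]
    return y, z

# A = [[1, 0, 2], [0, 3, 0], [4, 0, 5], [0, 6, 0]]
row_ptr = [0, 2, 3, 5, 6]
col_idx = [0, 2, 1, 0, 2, 1]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y, z = distributed_products(row_ptr, col_idx, vals,
                            [1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0], 2)
print(y)  # [7.0, 6.0, 19.0, 12.0]
print(z)  # [13.0, 30.0, 17.0]
```

With a row-block layout, the forward product needs no reduction because each block writes only its own rows of the result, whereas the transposed product produces a full-length partial vector per block that must be summed across blocks. In a hybrid setting, one plausible split is to combine the partials across processes with a sum reduction and to share the row loop among threads inside each process; the transposed product would then need per-thread partial vectors or atomic updates, since different rows can write to the same output entry.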
Keywords/Search Tags: MPI+OpenMP, Hierarchical Parallel, MPI-2, LSQR Algorithm