Study On Hybrid Parallel Molecular Dynamics Computing On Multicore Cluster

Posted on: 2013-02-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Z Bai
GTID: 1228330395474795
Subject: Computer software and theory
Abstract/Summary:
High performance computing has attracted great research interest because of the rapid development of high performance computing resources. The mainstream architecture in high performance computing has shifted from massively parallel processors to multicore clusters, and consequently the systems have changed from a single memory model to a hybrid memory model. Parallel programs designed to run on high performance computers must follow this change, which has led to the emergence of hybrid parallel programming models.

Molecular Dynamics (MD) simulation has been used extensively in many scientific research fields and has proved to be an important research method. Increasing the computing speed of MD simulation on multicore clusters will benefit many scientific research communities, so developing more efficient parallel MD algorithms has become attractive. Parallel MD algorithms, and other algorithms based on a hybrid parallel programming model for multicore clusters, are usually built by introducing multithreading, which is expensive. This high cost in turn makes the hybrid model inferior to the original message passing model. Solving the high cost problem is therefore an important issue.

This thesis examines hybrid parallel programming models and hybrid parallel MD algorithms, and analyses their deficiencies comprehensively and systematically. It proposes a set of related improvement and optimization algorithms for hybrid parallel MD computing.

The main contributions of this thesis are summarized as follows:

(1) Examines and compares different hybrid parallel programming models and parallel MD algorithms. This provides the foundation for the multithreading and hybrid parallel MD algorithms proposed in later chapters.

(2) Analyses the scalability of the "Critical Section" multithreading MD algorithm: the theoretical analysis and experimental results show that the speedup of the Critical Section algorithm decreases significantly as the number of cores increases. Based on this, an optimized Triangle method is proposed to increase the computing speed. In this method atoms are distributed unequally among threads. This static distribution makes the threads enter the critical region one by one, which reduces the idle time at the critical region and speeds up the computation.

(3) Proposes an OpenMP parallel MD algorithm, named the SPMD-like (Single Program Multiple Data) method. The SPMD-like method lets each thread work on its own data and compute the cross-boundary data relations redundantly. The same scheme is used in SPMD message passing programs. The SPMD-like method is simple to implement and does not require modifying the inner compute logic of MD. It needs only a few modifications to data structures and a spatial decomposition subroutine. The SPMD-like method has the same parallel computing performance and scalability as the pure message passing model, but with a lower implementation workload in OpenMP.

(4) Proposes a hybrid MPI/OpenMP parallel MD algorithm for multicore clusters. This method embeds the SPMD-like algorithm into an MPI-parallel MD program. The hybrid parallel program uses the OpenMP multithreading scheme within each node and has a lower multithreading cost. In addition, the communication time between nodes is reduced significantly, so the performance and efficiency of MD programs on multicore clusters are improved.

(5) Proposes a piece-round-reducing algorithm, which avoids the critical section completely. The piece-round-reducing method is as simple as the Critical Section method but has much better performance and scalability. The theoretical analysis and experimental results show that the method performs well when the number of cores is 16 or smaller, but performs worse than the SPMD-like method when the number of cores is larger than 16. This implies that the piece-round-reducing method is a good approach for hybrid implementations with a lower number of cores, whereas the SPMD-like method is a good approach for hybrid implementations with a higher number of cores.

(6) Proposes a hybrid MPI/TBB parallel MD algorithm and applies it to LAMMPS. The experimental results show that when the multicore cluster has a larger number of nodes, the hybrid MPI/TBB parallel LAMMPS performs better than the pure MPI model. This mainly results from the reduction of communication time.
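
The contrast between the Critical Section approach in (2) and the SPMD-like approach in (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses Python threads in place of OpenMP, a toy one-dimensional inverse-square pair force, and splits atoms by index rather than with a spatial decomposition subroutine. All function names are hypothetical. The first routine visits each pair once and updates both atoms, so the write to the partner atom is the one that would need a critical section when threaded. The second lets each worker write only to its own atoms and recompute cross-boundary pairs redundantly, so no locking is needed.

```python
from concurrent.futures import ThreadPoolExecutor


def forces_newton3(x, rc):
    """Serial reference: each pair is visited once and both atoms are updated.

    Assumption: toy 1-D inverse-square pair force with cutoff rc.
    If the i-loop were threaded, the f[j] update would be a shared write
    that has to be protected (for example by a critical section).
    """
    f = [0.0] * len(x)
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            r = x[i] - x[j]
            if abs(r) < rc:
                fij = r / abs(r) ** 3
                f[i] += fij
                f[j] -= fij  # write to an atom another thread may own
    return f


def forces_spmd_like(x, rc, n_workers):
    """SPMD-like sketch: each worker owns a block of atoms and writes only there.

    Pairs that cross a block boundary are computed twice, once by each owner,
    which trades redundant arithmetic for the absence of write conflicts.
    Assumption: blocks are contiguous index ranges, not spatial cells.
    """
    n = len(x)
    f = [0.0] * n
    bounds = [(w * n // n_workers, (w + 1) * n // n_workers)
              for w in range(n_workers)]

    def work(block):
        lo, hi = block
        for i in range(lo, hi):
            acc = 0.0
            for j in range(n):
                if j != i:
                    r = x[i] - x[j]
                    if abs(r) < rc:
                        acc += r / abs(r) ** 3
            f[i] = acc  # only the owner of atom i ever writes f[i]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(work, bounds))  # list() waits for all blocks and surfaces errors
    return f


def max_abs_diff(a, b):
    """Largest element-wise difference between two force arrays."""
    return max(abs(p - q) for p, q in zip(a, b))
```

Calling `forces_spmd_like(x, rc, n_workers)` on distinct positions should agree with `forces_newton3(x, rc)` up to floating-point rounding, since only the order of summation differs. The redundant version does roughly twice the pair evaluations, which is the price paid for removing the shared writes.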
Keywords/Search Tags: hybrid parallel programming model, multi-core cluster, molecular dynamics, MPI (Message Passing Interface), OpenMP (Open Multi-Processing)