Optimization Of Molecular Dynamics Simulation Algorithm On Sunway Supercomputers

Posted on: 2022-03-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Gao
Full Text: PDF
GTID: 1488306608477434
Subject: Computer Software and Application of Computer
Abstract/Summary:
Molecular dynamics (MD) simulation is a powerful computational method in diverse research fields such as physics, chemistry, biology, bioengineering, and medicine. It solves Newton's equations of motion for a system of interacting particles to study microscopic characteristics on an atomic or molecular scale. Because MD simulation builds a virtual mathematical model of the interactions between particles at the atomic and molecular level, it can carry out scientific research that cannot be done in ordinary laboratories. Moreover, MD can observe tiny experimental phenomena over very short time scales. However, as MD models have developed, the simulation process has become more complex, so the corresponding demands on supercomputers, such as computing power, memory, and network, have also increased rapidly. On the other hand, while supercomputers are suited to solving computationally intensive problems such as MD simulations, the needs of practical research keep growing. Only when the number of atoms approaches infinity and the simulation time is long enough can the macroscopic properties of materials be truly reflected. That leaves scientists looking forward to further scale-ups, both in space and time. Therefore, it is imperative to optimize the relevant simulation algorithms. The main work of this dissertation is to optimize MD simulation algorithms on domestic heterogeneous many-core supercomputers, based on the MD software Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS).

Firstly, a large-scale parallel chemical reaction simulation is performed using the Reactive Force Field (ReaxFF) in LAMMPS on the Sunway TaihuLight supercomputer. In this work, we have carefully redesigned the force analysis and neighbor list building steps. The redesign of the neighbor list eliminates write conflicts in the calculation of non-bonded interactions and makes parallel computation straightforward. By applying fine-grained optimizations, we gain better single-process performance. For the many-body interactions, including valence and torsion angles, we propose an isolated computation and update strategy and implement inverse trigonometric functions. For Charge Equilibration (QEq), we implement a pipelined preconditioned conjugate gradient (PCG) approach to achieve better scalability, which is also one of the key problems that must be solved in large-scale parallel computing. Furthermore, we reorganize the data layout and implement the update operation based on data locality in ReaxFF. The experiments show that this approach can simulate chemical reactions with 1,358,954,496 atoms using 4,259,840 cores with a performance of 0.015 ns/day. To the best of our knowledge, this is the first millimeter-scale reactive force field simulation of chemical reactions based on the MD software LAMMPS.

Then, for the interactions in ReaxFF, an in-depth and fine-grained optimization is carried out on the Sunway TaihuLight supercomputer following a "refactor-parallelization-vectorization" pipeline. First, the data structures and algorithms are refactored. Then, we analyze the computation of virial stress and develop a vectorized implementation for non-bonded interactions, which is nearly 100× faster than the management processing element (MPE) on the Sunway TaihuLight supercomputer. Furthermore, we implement the valence and torsion angle computation without a three-body list and propose a line-locked software cache method to eliminate write conflicts in the torsion angle and valence angle interactions, resulting in an order-of-magnitude speedup on a single Sunway TaihuLight node. In addition, we achieve a speedup of up to 3.5× compared to the KOKKOS package on an Intel Xeon Gold 6148 core. When executed on 1,024 processes, the implementation enables the simulation of 21,233,664 atoms on 66,560 cores with a performance of 0.032 ns/day and a weak scaling efficiency of 95.71%.

Finally, a novel Layered Materials Force Field (LMFF) was independently developed on a new-generation Sunway supercomputer, drawing on the exploration and experience with MD algorithms accumulated in the two works above. It is built on the LAMMPS framework, extends LAMMPS on the basis of its many-body Tersoff potential and Inter-Layer Potential (ILP), and offers high efficiency, scalability, and portability. This work first proposes dynamically solving the normal vector in the calculation of the ILP potential and integrates the calculation of attractive and repulsive interactions. Then a write buffer is designed to reduce the number of write operations, completing the refactoring of the ILP algorithm. Subsequently, the characteristics of the Tersoff and ILP potential functions are analyzed in depth, and the idea of a general short neighbor list is proposed, leading to the design and implementation of a general LMFF force field based on the LAMMPS framework. LMFF is designed to study layered materials such as graphene and hexagonal boron nitride. It is general-purpose and does not depend on any particular platform. A series of optimizations of LMFF was also carried out on the new-generation Sunway supercomputer; the optimized version is called SWLMFF. SWLMFF introduces a delayed construction and recalculation check mechanism to avoid filtering the short neighbor list from the long neighbor list at every time step. Moreover, a bit marking method is designed to reduce the number of memory accesses to the short neighbor list. Experiments show that our implementation is efficient, scalable, and portable. When the generic LMFF is ported to an Intel Xeon Gold 6278C, a 2× speedup is achieved. For the optimized SWLMFF, the overall performance improvement is roughly 200-330× compared to the original ILP and Tersoff potentials. SWLMFF also has good parallel efficiency of 95%-100% under weak scaling with 2.7 million atoms per process. The largest atomic system simulated by SWLMFF is close to 2^31 atoms. Nanosecond-scale simulations can be completed within one day, greatly improving the efficiency of researchers.

This work studies the parallel optimization of MD simulation algorithms on the heterogeneous many-core supercomputer Sunway TaihuLight and on the new-generation Sunway supercomputer. It uses the MD simulation software LAMMPS as the research object to demonstrate a development route from algorithm porting and optimization to independently developed work. The research takes the complex reactive force field ReaxFF as its starting point. First, large-scale parallel simulation of chemical reactions is realized, and then in-depth and fine-grained optimization is carried out. These steps laid the foundation for the independent research and development of LMFF, an efficient and scalable layered material force field. Taken together, the research contents and results show that the methods proposed in this work can complete large-scale MD simulations on heterogeneous many-core supercomputers, which greatly improves the work efficiency of scientific researchers. At present, LMFF has been used to study the tribological properties of layered materials. In the foreseeable future, the results of this work will serve more scientific researchers and will further promote the development and application of the domestic Shenwei supercomputer.
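
The abstract states that redesigning the neighbor list removes write conflicts in the non-bonded calculation, but it does not say how. One common way to get that property, shown here only as an illustrative sketch and not necessarily the method used in this work, is to switch from a half neighbor list (each pair stored once, with the force scattered to both atoms) to a full neighbor list (each pair stored twice, with each outer iteration writing only to its own atom). The toy `pair_force` and all function names below are hypothetical.

```python
import math

def pair_force(ri, rj):
    """Toy repulsive force on atom i due to atom j (hypothetical 1/r^2 form)."""
    d = [a - b for a, b in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in d))
    return [c / r**3 for c in d]

def forces_half_list(pos, half_list):
    """Each pair is stored once; the update of f[j] is the potential write conflict."""
    f = [[0.0, 0.0, 0.0] for _ in pos]
    for i, neigh in enumerate(half_list):
        for j in neigh:
            fij = pair_force(pos[i], pos[j])
            for k in range(3):
                f[i][k] += fij[k]
                f[j][k] -= fij[k]  # scattered write into another atom's row
    return f

def half_to_full(half_list):
    """Mirror every stored pair (i, j) so that j also lists i."""
    full = [list(neigh) for neigh in half_list]
    for i, neigh in enumerate(half_list):
        for j in neigh:
            full[j].append(i)
    return full

def forces_full_list(pos, full_list):
    """Each pair is stored twice; iteration i writes only to f[i]."""
    f = [[0.0, 0.0, 0.0] for _ in pos]
    for i, neigh in enumerate(full_list):
        for j in neigh:
            fij = pair_force(pos[i], pos[j])
            for k in range(3):
                f[i][k] += fij[k]
    return f
```

The full-list loop does twice the pair evaluations, but its outer iterations are independent, so they can be distributed over cores without locks or atomic updates. Whether that trade-off pays off depends on the cost of the pair kernel and of synchronization on the target hardware.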
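
The delayed construction and recalculation check for the short neighbor list can be pictured with the sketch below. It assumes a conventional skin-distance criterion (rebuild only once some atom has moved more than half the skin since the last build); the abstract does not specify the actual check used in SWLMFF, and `ShortNeighborCache`, `r_short`, and `skin` are hypothetical names.

```python
import math

def build_short_list(pos, long_list, r_short, skin):
    """Keep only the long-list neighbors that lie within r_short + skin."""
    limit = r_short + skin
    return [[j for j in neigh if math.dist(pos[i], pos[j]) <= limit]
            for i, neigh in enumerate(long_list)]

def needs_rebuild(pos, pos_at_build, skin):
    """Recheck: True if any atom moved more than skin/2 since the last build."""
    return any(math.dist(p, q) > 0.5 * skin
               for p, q in zip(pos, pos_at_build))

class ShortNeighborCache:
    """Reuses the filtered short list across time steps while it stays valid."""

    def __init__(self, r_short, skin):
        self.r_short = r_short
        self.skin = skin
        self.short = None
        self.pos_at_build = None
        self.builds = 0  # counts how often the filter actually ran

    def get(self, pos, long_list):
        if self.short is None or needs_rebuild(pos, self.pos_at_build, self.skin):
            self.short = build_short_list(pos, long_list, self.r_short, self.skin)
            self.pos_at_build = [tuple(p) for p in pos]
            self.builds += 1
        return self.short
```

Because the cached list is a superset padded by the skin, a force kernel using it would still need to test each pair against `r_short`. The sketch also assumes the long neighbor list is unchanged between calls; whenever the long list is rebuilt, the cache has to be discarded as well.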
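
The performance figures quoted above use two standard metrics: simulated nanoseconds per day of wall-clock time, and weak scaling efficiency. The helper below sketches how they are usually computed. The function names are hypothetical, and the abstract reports neither the time step nor the per-step timings behind its figures, so any inputs are the reader's own.

```python
SECONDS_PER_DAY = 86400.0
FS_PER_NS = 1.0e6

def ns_per_day(timestep_fs, seconds_per_step):
    """Simulated nanoseconds advanced in one wall-clock day."""
    steps_per_day = SECONDS_PER_DAY / seconds_per_step
    return steps_per_day * timestep_fs / FS_PER_NS

def weak_scaling_efficiency(t_base, t_scaled):
    """Weak scaling: work per process is fixed, so ideal run time stays constant.

    t_base is the run time on the smallest configuration and t_scaled the run
    time after growing the process count and the problem size together.
    """
    return t_base / t_scaled
```

Under weak scaling the atom count grows in proportion to the number of processes, so an efficiency close to 1 means the per-step time barely changes as the system is enlarged.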
Keywords/Search Tags: Molecular dynamics, LAMMPS, Supercomputing, System Scalability, Massive parallelization