
Parallel Implementation And Performance Optimization For FHI-aims On The Sunway Many-core Architecture

Posted on: 2022-07-21    Degree: Master    Type: Thesis
Country: China    Candidate: E Y Zhao    Full Text: PDF
GTID: 2518306353983629    Subject: Computer Science and Technology
Abstract/Summary:
In recent years, supercomputers have played an increasingly prominent role in national security, scientific and technological innovation, economic development and other areas. Sunway TaihuLight, a new-generation supercomputer system independently developed in China, has a peak performance of 125.4 PFLOPS. It is the first system in the world with a peak above 100 PFLOPS, ranking fourth in the world. Its Linpack performance is at the PFLOPS scale, and each compute node is built on an SW26010 processor. The machine has about 10.65 million cores in total, provided by 40,960 SW26010 processors. FHI-aims is a commercial first-principles simulation package that handles both molecular and solid-state systems. Because of its powerful functionality, excellent accuracy and smooth user experience, it is widely used in universities and research institutions around the world and is considered one of the main competitors of VASP. The objective of this thesis is to port FHI-aims to the Sunway system and optimize it there, and to provide a reference for porting and optimizing other large-scale parallel software. Target applications include studying the properties of nanomaterials with wide application prospects, simulating and even designing next-generation optoelectronic devices, and simulating Raman spectra of large molecular crystals to understand the mechanisms of chemical reactions. The research work and results of this thesis are as follows.

(1) Parallelization of FHI-aims on the many-core platform. This work focuses on developing a massively parallel first-principles program. It makes full use of the many-core processor architecture to implement thread-level task parallelism, data parallelism and hybrid parallelism on the compute cores, which improves many-core parallel efficiency. Efficient communication is adopted to improve memory access performance, and vectorization is used to improve computing performance.

(2) Optimization of FHI-aims on the compute cores. Memory access is optimized first: while making full use of the local data memory (LDM), the cost of memory access is reduced by optimizing the data structures. Vector instructions are then used to apply the same operation to several operands at once, exploiting the data-level parallelism in the source code; SIMD instructions replace scalar instructions in the executable code to improve running efficiency. Finally, a parallel strategy that hides communication behind computation is proposed. The FHI-aims tasks on the computing processing elements (CPEs) are partitioned so that communication time can be completely hidden by computation time, and the kernel computation and update iterations can be overlapped.

(3) A global real-time task scheduling strategy based on intelligent fitting is proposed for high-performance many-core systems. An improved intelligent fitting algorithm is used to build a scheduling model based on the cache access behavior of tasks. The model can be used to classify tasks and to optimize the global real-time task scheduling strategy in high-performance many-core systems.
Keywords/Search Tags: parallel computing, FHI-aims, Sunway TaihuLight system, SW26010 heterogeneous many-core processor, task scheduling
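
The abstract's claim that communication time can be completely hidden by computation time can be illustrated with a simple timing model. The sketch below is not the thesis implementation. It assumes a double-buffered pipeline in which the transfer of the next data tile runs concurrently with computation on the current one, and the names `serial_time`, `overlapped_time`, `n_tiles`, `t_transfer` and `t_compute` are hypothetical, chosen only for illustration.

```python
def serial_time(n_tiles, t_transfer, t_compute):
    """Total time when each tile is transferred, then computed, with no overlap."""
    return n_tiles * (t_transfer + t_compute)


def overlapped_time(n_tiles, t_transfer, t_compute):
    """Total time for a two-buffer pipeline (an assumed model, not measured data).

    The first transfer cannot be hidden. After that, each step computes one
    tile while the next tile is transferred, so a step costs the larger of
    the two times. The last tile is computed with nothing left to transfer.
    """
    if n_tiles == 0:
        return 0.0
    steady_step = max(t_transfer, t_compute)
    return t_transfer + (n_tiles - 1) * steady_step + t_compute


def exposed_transfer(n_tiles, t_transfer, t_compute):
    """Transfer time that is not hidden behind computation."""
    return overlapped_time(n_tiles, t_transfer, t_compute) - n_tiles * t_compute


if __name__ == "__main__":
    # Illustrative numbers only: 4 tiles, 1 time unit per transfer, 3 per compute.
    print(serial_time(4, 1.0, 3.0))       # no overlap
    print(overlapped_time(4, 1.0, 3.0))   # double-buffered
    print(exposed_transfer(4, 1.0, 3.0))  # only the first transfer remains visible
```

In this model, whenever the per-tile computation takes at least as long as the per-tile transfer, the only communication cost left is the first transfer; if transfers are slower than computation, the pipeline becomes transfer-bound and the hiding is only partial.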