
Parallel Algorithm Analysis And Optimization Of Plasma Structure Preserving Large-scale Simulation On Sunway Platform

Posted on: 2021-03-30; Degree: Master; Type: Thesis
Country: China; Candidate: F Lu
GTID: 2428330602498993; Subject: Computer system architecture
Abstract/Summary:
With advances in processor technology and growing application demand for computing power, microprocessor chips integrate ever more resources and cores, and the trend is toward heterogeneous many-core designs. China's most powerful supercomputer, Sunway TaihuLight, consists of 40,960 independently designed SW26010 heterogeneous many-core processors. The SW26010 uses a special master-slave structure that integrates 260 heterogeneous computing cores on one chip and provides up to 3.06 TFlops of peak performance. However, its unique hardware architecture also poses great challenges for parallel software development. SymPIC is a simulation code for the Vlasov-Maxwell plasma system that uses a symplectic structure-preserving algorithm and supports efficient, long-term, large-scale plasma simulation. Starting from SymPIC, this thesis re-implements and optimizes the parallel algorithm for large-scale plasma simulation on the Sunway many-core platform. The goals are to overcome the difficulties imposed by the hardware architecture, to make full use of the hardware's computing power, and to offer suggestions, from the application point of view, for improving the architecture of domestic supercomputing systems. The main research contents of this thesis are as follows.

(1) To understand the code structure and runtime behavior of SymPIC in depth, we carry out extensive analysis on commercial platforms. On the one hand, we analyze the program structure of SymPIC, describe its constituent modules and their function call relationships, and on this basis present the code structure and data structures of the computational kernel. On the other hand, we comprehensively test and analyze SymPIC on commercial platforms using performance analysis tools, and characterize its computation, memory access, communication, and I/O behavior, which provides a reference for the optimization work on the Sunway platform.

(2) We present SymPIC's parallel algorithm and optimization strategies on Sunway. First, we give the details of the parallel algorithm, including task division, kernel code structure, and the parallel programming model, and carry out a targeted bottleneck analysis. To address low computational efficiency, we propose two vectorization strategies of different granularities that fully exploit data-level parallelism and, to a certain extent, alleviate the high instruction cache miss rate. To overcome the severe memory access limitation, we use DMA operations and data rearrangement to accelerate data movement, and design a software-emulated cache and a multi-buffer prefetch strategy to achieve data reuse and hide memory access cost. Finally, we propose a distributed I/O scheme that sustains I/O performance at large scale and keeps the output file overhead within an acceptable range.

(3) We design detailed and comprehensive experiments to evaluate the performance of SymPIC on Sunway. The results show that the optimized SymPIC achieves speedups of 88.30 times over the version that uses only the MPE and 2.57 times over the version that uses both the MPE and the CPE clusters, and that the parallel efficiency exceeds 86% for strong scaling and 94% for weak scaling. This thesis also analyzes the software and hardware limitations that SymPIC encounters on Sunway, and provides a reference for improving the hardware and software of domestic supercomputers.
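The multi-buffer prefetch idea mentioned in (2) can be illustrated with a minimal sketch. This is not the thesis's implementation: the tile size `TILE`, the helper `fetch_tile`, and the function `sum_double_buffered` are hypothetical names introduced here, and a plain `memcpy` stands in for an asynchronous DMA transfer, so the sketch shows only the buffer rotation and not real overlap of transfer and computation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 64 /* hypothetical tile size, in elements, assumed to fit in fast local memory */

/* Stand-in for an asynchronous DMA read from main memory into local memory.
   Assumption: a real DMA call would return immediately and be waited on later. */
static void fetch_tile(double *local, const double *main_mem, size_t count) {
    memcpy(local, main_mem, count * sizeof(double));
}

/* Sums n doubles using two local buffers: while tile t is processed from one
   buffer, tile t+1 is fetched into the other. */
double sum_double_buffered(const double *data, size_t n) {
    double buf[2][TILE];
    double total = 0.0;
    size_t ntiles = (n + TILE - 1) / TILE;

    assert(data != NULL || n == 0);
    if (ntiles == 0) {
        return 0.0;
    }

    /* Prime the pipeline with the first tile. */
    fetch_tile(buf[0], data, n < TILE ? n : TILE);

    for (size_t t = 0; t < ntiles; ++t) {
        size_t cur = t % 2;
        size_t count = (t + 1 == ntiles) ? n - t * TILE : TILE;

        /* Issue the fetch of the next tile into the buffer not in use. */
        if (t + 1 < ntiles) {
            size_t next_off = (t + 1) * TILE;
            size_t next_count = (n - next_off < TILE) ? n - next_off : TILE;
            fetch_tile(buf[1 - cur], data + next_off, next_count);
        }

        /* With real asynchronous DMA, a wait for tile t's transfer would go here. */
        for (size_t i = 0; i < count; ++i) {
            total += buf[cur][i];
        }
    }
    return total;
}
```

With a truly asynchronous transfer, the fetch of tile t+1 would proceed while the inner loop works on tile t, which is how such a scheme can hide memory access cost behind computation.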
Keywords/Search Tags: Parallel Algorithm, Optimization, Sunway TaihuLight, Heterogeneous Many-Core Processor, Plasma Simulation