Research And Practice Of Optimization Technology For Typical Supercomputer Applications

Posted on: 2022-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
GTID: 2518306491984319
Subject: Computer Science and Technology
Abstract/Summary:
High-performance computing has become the third pillar of scientific research, after theoretical analysis and experiment. It applies advanced computing capability to complex, large-scale scientific problems that are difficult or impossible to solve by theory or experiment alone, and it has become indispensable support for research and innovation in universities. In recent years in particular, the convergence of high-performance computing, artificial intelligence, and big data has demanded more computing power to support the development of university disciplines. The computing power of supercomputing platforms at Chinese universities has passed the petaflop level, and many of these resources are built on the new generation of Intel Cascade Lake Scalable processors. How to make full use of these resources to accelerate research has become a major challenge for users. System-level, compiler-level, and code-level optimization techniques can help applications improve computing performance and efficiency and make full use of the available computing power.

Taking several typical applications on one university's supercomputing platform as examples, this thesis studies four kinds of optimization techniques for improving application efficiency and platform utilization. It first introduces and analyzes each in detail: at the hardware level, pipelining, superscalar execution, and SIMD; at the runtime level, tuning of parallel parameters and thread affinity; at the compiler level, compile-time optimization options, parallel compilation, and optimized math libraries; and at the code level, data parallelism and task parallelism. It then examines four applications running on the platform: VASP, WRF, a cryo-electron microscopy three-dimensional reconstruction application, and a tensor force extraction application based on relativistic Hartree-Fock theory. For each, it analyzes the performance bottlenecks, proposes an optimization scheme that targets them, and verifies the feasibility of the scheme experimentally.

The experimental results are as follows. For VASP, tuning the NPAR and NCORE parameters away from their defaults gives a speedup of up to 6.4 times. For WRF, a numerical weather prediction model, compiler optimization with GCC improves performance by up to 23.77%, and WRF built with the Intel compiler runs faster than WRF built with GCC. For the image similarity algorithm in the cryo-electron microscopy three-dimensional reconstruction program, the algorithm is optimized to exploit multi-core CPUs: shared-memory OpenMP multithreading is used to analyze and exploit the parallelism of the image similarity calculation in Fourier space, the calculation is parallelized at multiple levels, and the best speedup reaches 61.103. For the tensor force extraction application based on relativistic Hartree-Fock theory, the parallelism of the algorithm is exploited with a hybrid OpenMP-MPI model: CPU utilization rises from under 1% in the original algorithm to 94.8%, the vectorization ratio rises to 100%, and the speedup reaches 36.779.
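The abstract does not describe how the NPAR and NCORE values were chosen. As a rough illustration of what such runtime parameter tuning involves, the sketch below enumerates candidate settings for a given number of MPI ranks. It assumes the relation given in the VASP documentation, NCORE × NPAR = number of MPI ranks (for a single k-point group), with a default of NCORE = 1. The function names `ncore_npar_candidates` and `incar_line` are hypothetical helpers introduced here, not part of VASP or of the thesis.

```python
def ncore_npar_candidates(total_ranks: int) -> list[tuple[int, int]]:
    """Return every (NCORE, NPAR) pair with NCORE * NPAR == total_ranks.

    Assumes a single k-point group (KPAR = 1), so all MPI ranks are
    divided between the two tags.
    """
    if total_ranks < 1:
        raise ValueError("total_ranks must be a positive integer")
    pairs = []
    for ncore in range(1, total_ranks + 1):
        if total_ranks % ncore == 0:
            pairs.append((ncore, total_ranks // ncore))
    return pairs


def incar_line(ncore: int) -> str:
    """Format the INCAR line for one candidate.

    Only NCORE is written; NPAR then follows from the rank count.
    """
    return f"NCORE = {ncore}"


if __name__ == "__main__":
    # Hypothetical job size; each candidate would be benchmarked separately.
    for ncore, npar in ncore_npar_candidates(8):
        print(f"{incar_line(ncore)}   (implies NPAR = {npar})")
```

Each candidate would then be timed on a representative calculation and the fastest one kept. The best value depends on the node architecture and the system being simulated, so the sketch only generates the search space.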
Keywords/Search Tags: Supercomputing application, Optimization, Parallel operation parameters, Parallel algorithm