
Performance Evaluation And Application Porting Optimization Of HPC System Towards ARM Architecture

Posted on: 2021-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z R Ge
GTID: 2428330611452104
Subject: Engineering·Computer Technology
Abstract/Summary:
The world is moving towards an age of diversified computing, and Moore's Law is being reconstructed. The high-end server market for high performance computing (HPC) has generally been occupied by x86 processors from Intel and AMD, which offer high performance and strong universality. However, x86 processors face energy-consumption issues in building next-generation exascale supercomputers. The Advanced Reduced Instruction Set Computer (RISC) Machine (ARM) architecture, which has long been active in the mobile and embedded terminal market, is expected to relieve the energy-consumption bottleneck of HPC systems with its low power consumption and high performance. How ARM actually performs and which applications suit it remain unknown, and these are still the questions of greatest concern to researchers in the HPC field. Most previous research on ARM focused only on energy consumption rather than performance. Work on ARM performance evaluation is still lacking, let alone work on application porting and optimization for the ARM computing architecture.

Firstly, the thesis analyzed the mainstream computing architectures in the HPC field and their market, focusing on the ARMv8-A architecture and its characteristics. Secondly, the thesis proposed a test object-based evaluation system that takes the features of HPC applications into account. For the system objects, the thesis analyzed the performance difference between an ARM system (HUAWEI Kunpeng920 processor) and an x86 system (commercial Intel Xeon 6146 processor) using HPL, HPCG, STREAM, IOZone, OMB and other benchmarks, focusing on floating-point performance, continuous memory bandwidth, disk read and write performance, and network performance. In addition, the thesis took two typical HPC applications as practical applications and analyzed the actual single-core, multi-core, and multi-node performance of the two platforms using computation speed, performance benchmarks and so on. Finally, taking GROMACS as an example, the thesis explored the porting process for users who want to port their own software from an x86 system to the TaiShan server, and optimized GROMACS step by step in both hardware and software aspects.

According to the experimental results, the floating-point performance of the Kunpeng920 processor is about one-third of that of the commercial Xeon 6146 processor. In terms of sustainable memory bandwidth on a single node, the Kunpeng920 processor achieves near-linear memory bandwidth growth thanks to its many cores and multiple memory channels. In disk performance, ARM performs better in both read and write capability. In network performance, ARM is superior to the x86 system in point-to-point communication latency and also does better in large-file transmission. In the realistic application results, ARM is at a disadvantage relative to the commercial Intel Xeon 6146 processor in compute-intensive applications such as NAMD, but delivers a 2 to 5 times performance improvement in memory-intensive applications such as WRF. Furthermore, after porting and optimization, GROMACS outperformed the x86 platform by 10.7%. Considering cost and performance loss together, the Kunpeng920 processor is highly competitive for building HPC systems and is a promising platform for research and application.
Keywords/Search Tags:High Performance Computing, Advanced RISC Machine, Performance Evaluation, Application Porting, Application Optimization