
Research On Sunway OpenACC Transplant And Parallelization Of Task Map Based On Data Stream For Silicon-Crystal Application

Posted on: 2020-06-13 Degree: Master Type: Thesis
Country: China Candidate: Y Xu
GTID: 2518306305995809 Subject: Computer system architecture
Abstract/Summary:
"Sunway TaihuLight" is the world's first supercomputer with a peak performance above 100 PFlops: its theoretical peak is 125.4 PFlops and its measured peak is about 93 PFlops. For applications on Sunway TaihuLight, the system provides Sunway OpenACC, a directive-based parallel programming language built on control flow, which allows the kernels of an existing application to be ported to the many-core processor quickly. However, on the SW26010 processor, whose memory access bandwidth is limited, the traditional fork-join mode easily causes memory access congestion and contention among the slave cores, which degrades computing performance. In contrast to traditional control-flow task execution, the AceMesh parallel programming framework is based on data flow. It targets practical grid-based applications on multi-core and many-core platforms, and it can stagger memory access peaks while fully exploiting the parallelism between tasks.

The Silicon-Crystal application is a typical molecular dynamics simulation of the thermal conductivity of silicon crystals at low temperatures. It simulates the motion of silicon atoms under the Tersoff potential, which is one of the hot spots in process engineering research. Within the force field, the calculation of the potential energy between silicon atoms is very complicated. The application is computationally expensive and iterates over many time steps, and it also suffers from low parallel granularity and low computational efficiency.

To address these problems, this thesis increases the parallel granularity by restructuring the data and code of the Silicon-Crystal application where necessary. On this basis, a master-slave parallelization scheme for the SW26010 processor is designed and implemented, and the Silicon-Crystal application is successfully ported to Sunway TaihuLight using Sunway OpenACC. Because control-flow-based Sunway OpenACC cannot effectively optimize memory bandwidth usage or exploit parallelism across time-step iterations in this memory-intensive application, a data-flow-based task graph parallelization is also applied. It mines the parallelism between tasks of different iterations and staggers the memory access peaks of those tasks to make better use of memory bandwidth, which yields a higher speedup.

Experimental results show that the Sunway OpenACC port achieves a 2.26x speedup on a single core group. With 1 time step, the task graph version achieves a 2.52x speedup, 11.5% higher than Sunway OpenACC. When the run is extended to 20 time steps, the task graph grows and out-of-order task scheduling further increases the benefit of staggered memory access. The whole application then achieves a 3.2x speedup, 42% higher than Sunway OpenACC.
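The difference between fork-join execution and data-flow task graph execution described above can be illustrated with a minimal Python sketch. It is not the AceMesh API and not the thesis's implementation: the 1-D block decomposition, the neighbor dependency pattern, and all function names are assumptions chosen only for illustration. Each task (t, b) updates block b at time step t and needs blocks b-1, b, and b+1 from step t-1. A fork-join schedule finishes every task of a step before starting the next, while a data-flow schedule runs a task as soon as its inputs are ready, so tasks from different time steps can interleave.

```python
def build_task_graph(num_steps, num_blocks):
    """Return {task: set(predecessors)} for a hypothetical 1-D blocked grid."""
    deps = {}
    for t in range(num_steps):
        for b in range(num_blocks):
            preds = set()
            if t > 0:
                # Assumed dependency: a block needs itself and its neighbors
                # from the previous time step.
                for nb in (b - 1, b, b + 1):
                    if 0 <= nb < num_blocks:
                        preds.add((t - 1, nb))
            deps[(t, b)] = preds
    return deps


def fork_join_order(num_steps, num_blocks):
    """Control-flow schedule: an implicit barrier after every time step."""
    return [(t, b) for t in range(num_steps) for b in range(num_blocks)]


def dataflow_order(deps):
    """Data-flow schedule: run any task whose predecessors have finished."""
    remaining = {task: len(preds) for task, preds in deps.items()}
    successors = {task: [] for task in deps}
    for task, preds in deps.items():
        for p in preds:
            successors[p].append(task)
    ready = [task for task, n in sorted(remaining.items()) if n == 0]
    order = []
    while ready:
        task = ready.pop()  # LIFO choice favors newly enabled tasks
        order.append(task)
        for s in sorted(successors[task]):
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)
    return order


def is_valid(order, deps):
    """Check that every task runs after all of its predecessors."""
    pos = {task: i for i, task in enumerate(order)}
    return all(pos[p] < pos[task] for task, preds in deps.items() for p in preds)


def crosses_time_steps(order):
    """True if a task of an earlier step runs after a task of a later step."""
    max_step_seen = -1
    for t, _ in order:
        if t < max_step_seen:
            return True
        max_step_seen = max(max_step_seen, t)
    return False


if __name__ == "__main__":
    deps = build_task_graph(num_steps=3, num_blocks=4)
    print("fork-join:", fork_join_order(3, 4))
    print("data-flow:", dataflow_order(deps))
```

Both schedules respect the dependencies, but only the data-flow order mixes tasks from different time steps. This is the property the abstract relies on when it says a larger task graph gives the scheduler more freedom to stagger memory access.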
Keywords/Search Tags: Sunway TaihuLight, Sunway OpenACC, Data flow, Parallelization of task map, MD (molecular dynamics) simulation