Study of Parallel Task Graph Scheduling Optimization on the Sunway TaihuLight

Posted on: 2019-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: X R Gao
GTID: 2428330578972743
Subject: Software engineering
Abstract/Summary:
In recent years, the hardware of heterogeneous systems has developed rapidly. How to develop parallel programs for heterogeneous many-core platforms more efficiently has become one of the key factors restricting the development of these systems. The Sunway TaihuLight system, China's self-developed supercomputer, is built on the domestic many-core heterogeneous processor "SW26010" and has reached top-tier computing capability. However, parallel programming on it is difficult and development efficiency is low. The "SW26010" processor provides developers with two options. The accelerated thread library "Athread" offers users a low-level API for controlling threads, but its programming level is low. The parallel programming model "SW OpenACC*" lets users parallelize programs through compiler directives, but it does not provide a feasible scheme for reusing data cached in the LDM.

To make full use of computing resources and reduce the difficulty of programming for parallel program developers, this work studies the domestic heterogeneous many-core platform on the basis of practical applications and designs AceMesh, a parallel task programming framework that supports this platform. AceMesh is a data-driven, task-parallel programming framework for grid applications, and it relies on the underlying AceMesh task scheduling system to execute tasks according to the task dependency graph. Based on the Athread library and targeting the SW26010 processor, this thesis reworks the ready task queue of the AceMesh task scheduling system and introduces hierarchical task queues. It designs and implements cooperative scheduling support for master-slave tasks and awareness of blocking MPI tasks, to improve inter-task scheduling within a process. It also designs and implements management of cache reuse between tasks, to take full advantage of the LDM of the CPEs.

With the vertical reuse rate remaining stable, the hierarchical task queues improved the average total scheduling overhead by 7%-28%, and the introduction of a dynamic affinity strategy gave a fivefold speedup for the tend lin application when running with more than 8 threads. Cache reuse improved the DMA performance of the fdtd-2d case by 10%-25%, while also significantly reducing programming difficulty and making code reuse possible. In fluid dynamics applications, compared with the SW OpenACC program, the out-of-order scheduling of the AceMesh task scheduling system ran 1.43 times faster on algorith01, and 2-dimensional tasks ran 0.7 times faster on FCTA01 and FCTA02.
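The idea of hierarchical ready queues driving a task dependency graph can be illustrated with a minimal sketch. This is not the AceMesh implementation and does not use the Athread library. The names `Task`, `add_edge`, `HierarchicalScheduler` and `build_demo` are hypothetical, the affinity hint is assumed to be a fixed worker index rather than the dynamic strategy described in the abstract, and workers are simulated in a single thread, round by round. Each worker looks in its private queue first and falls back to a shared queue, and a task becomes ready only once all of its predecessors have finished.

```python
from collections import deque


class Task:
    """A node in the task graph (hypothetical illustration type)."""

    def __init__(self, name, affinity=None):
        self.name = name
        self.affinity = affinity  # preferred worker index, or None
        self.successors = []      # tasks that depend on this one
        self.pending = 0          # number of unfinished predecessors


def add_edge(pred, succ):
    """Record that succ may only run after pred has finished."""
    pred.successors.append(succ)
    succ.pending += 1


class HierarchicalScheduler:
    """Two-level ready queues: one private queue per worker plus a shared one."""

    def __init__(self, num_workers):
        self.private = [deque() for _ in range(num_workers)]
        self.shared = deque()

    def make_ready(self, task):
        # Tasks with an affinity hint go to that worker's private queue;
        # all other tasks go to the shared queue.
        if task.affinity is not None:
            self.private[task.affinity].append(task)
        else:
            self.shared.append(task)

    def next_task(self, worker):
        # Private queue first, shared queue as the fallback level.
        if self.private[worker]:
            return self.private[worker].popleft()
        if self.shared:
            return self.shared.popleft()
        return None

    def run(self, tasks):
        """Simulate the workers round by round; return (worker, task name) pairs."""
        for task in tasks:
            if task.pending == 0:
                self.make_ready(task)
        order = []
        remaining = len(tasks)
        while remaining:
            progressed = False
            for worker in range(len(self.private)):
                task = self.next_task(worker)
                if task is None:
                    continue
                order.append((worker, task.name))
                remaining -= 1
                progressed = True
                # Finishing a task may release its successors.
                for succ in task.successors:
                    succ.pending -= 1
                    if succ.pending == 0:
                        self.make_ready(succ)
            if not progressed:
                raise ValueError("cycle or unreachable task in graph")
        return order


def build_demo():
    """Small diamond-like graph: a and b feed c, and c feeds d."""
    a = Task("a", affinity=0)
    b = Task("b", affinity=1)
    c = Task("c", affinity=0)
    d = Task("d")  # no affinity hint, so it lands in the shared queue
    add_edge(a, c)
    add_edge(b, c)
    add_edge(c, d)
    return [a, b, c, d]
```

Calling `HierarchicalScheduler(2).run(build_demo())` returns the order in which the two simulated workers picked up tasks. In this sketch, the private queues stand in for keeping a task on the core whose local memory already holds its data, while the shared queue keeps idle workers busy.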
Keywords/Search Tags:SW26010, AceMesh, scheduling overhead, parallelism, cache reuse