Parallel Algorithms Research Of Particle Transport On Heterogeneous Architecture

Posted on: 2012-12-13 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: C Y Gong | Full Text: PDF
GTID: 1118330362460507 | Subject: Computer Science and Technology
Abstract/Summary:
In many physical phenomena, the particle transport equation (Boltzmann equation) is used to describe the particle transport process. For example, the neutron transport equation is applied to calculate the neutron distribution of the core and the shielding design in nuclear reactors. In the biomedical field, transport theory is used to determine the dose of radiation therapy. In astrophysics, semiconductor physics, plasma physics, cosmic ray showers, gas discharge physics and other disciplines, neutron transport theory has been used to study the transport of photons, electrons, plasma and other particles.

There are usually two types of numerical methods for solving the Boltzmann transport equation. The first is the deterministic approach, which includes time differencing, spatial differencing, the finite element method, the multi-group approximation, the discrete ordinates method for the angular direction, and the spherical harmonics method. The second is the non-deterministic, or Monte Carlo, method, which obtains the desired results by simulating individual particle histories and combining the information from many such histories.

Traditional high-performance computers generally use commercial general-purpose processors, and massively parallel computing systems built this way face many challenges, such as system efficiency, power consumption, system maintenance and cost. In recent years, heterogeneous architectures have become a trend in building supercomputer systems. Compared with a traditional parallel computer system, a heterogeneous computing system offers explicit micro-vector SIMD data-level parallelism in addition to multi-core parallelism and implicit hardware instruction-level parallelism.

This thesis focuses on data-level parallel algorithms for both deterministic and non-deterministic particle transport numerical methods on heterogeneous architectures. The research achievements include the following aspects:

1. This thesis proposes a grid-level data-parallel algorithm for particle transport in three-dimensional Cartesian geometry. The algorithm overcomes the limited concurrency of the flux sweep and exploits the parallelism in the grid computation. In this algorithm, the recursive discrete ordinates equation is parallelized effectively, and the GPU memory hierarchy is used to speed up memory access and reduce the cache miss rate. The results show that, for the problem model without flux fixup, the GPU achieves a speedup of 2.56 to 8.14 times over a multi-core CPU. For the problem model with flux fixup, the GPU improves performance by a factor of 1.23 over a multi-core CPU.

2. The thesis proposes a data-level parallel algorithm for the discontinuous Lagrange finite element method and the discrete ordinates method in a two-dimensional cylindrical coordinate system. The parallelism of the numerical algorithm is abstracted as wavefront parallelism. The wavefronts, the concurrent computation over mesh cells, and the energy groups are mapped onto the kernels, thread blocks and threads of the CUDA thread execution model. The whole transport sweep is divided into two steps: a pre-sweep algorithm determines the sweep order, and a parallel sweep algorithm performs the actual flux calculations. The feasibility of the pre-sweep algorithm and the degree of parallelism of the parallel flux sweep algorithm are analyzed. A design method for hierarchical heterogeneous parallel algorithms is proposed from the perspective of loop unrolling and loop splitting. Experimental results show that the GPU achieves a speedup of 11.03 to 17.96 times over a single-core CPU.

3. The thesis proposes a data-level parallel algorithm for non-deterministic particle transport based on MCNP. The algorithm overcomes the memory access conflicts introduced by parallel multi-threading. A new data structure is designed to satisfy the requirements of coalesced GPU global memory access and of massive non-deterministic simulation. Based on the features of the GPU hardware architecture and of non-deterministic particle transport calculations, several optimization methods are proposed, including the rational use of the GPU memory hierarchy and the simplification of the GPU kernel code. Parallel pseudo-random number generation and the sampling of the angular distribution are presented as examples. Experimental results show that the GPU achieves a speedup of 16.3 to 23.67 times over a single-core CPU.

4. A heterogeneous, scalable parallel particle transport framework is designed on the basis of the heterogeneous particle transport algorithms. The hierarchical design, heterogeneous type definitions, data structure design, module design and interface design of the framework are described. The parallel framework provides efficient parallel processing capabilities and hides the corresponding implementation details. Two test cases are presented: three-dimensional structured-grid particle transport on multiple GPUs, and an MC benchmark on a heterogeneous CPU/GPU platform.

In summary, this thesis proposes an effective solution for scalable parallel particle transport computation. The solution has theoretical and practical value for promoting the theoretical research and application of scalable parallel particle transport computation on heterogeneous architectures.
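The two-step sweep described in item 2 (a pre-sweep that fixes the sweep order, followed by a parallel sweep over each wavefront) can be illustrated with a much simpler model than the one in the thesis. The sketch below is only an assumption-laden illustration: it uses a two-dimensional Cartesian grid, a single angular direction with positive direction cosines, a single energy group, a first-order upwind (step) discretization and vacuum inflow boundaries. NumPy vectorization stands in for GPU threads, and the function names `build_wavefronts`, `sweep_wavefront` and `sweep_sequential` are hypothetical, not taken from the thesis.

```python
import numpy as np


def build_wavefronts(nx, ny):
    """Pre-sweep: group cells into wavefronts by dependency level i + j.

    For a direction with mu > 0 and eta > 0, cell (i, j) depends only on
    cells (i - 1, j) and (i, j - 1), so all cells sharing the same value
    of i + j are mutually independent.
    """
    i, j = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
    level = (i + j).ravel()
    order = np.argsort(level, kind="stable")
    # Split the sorted flat cell indices wherever the level changes.
    boundaries = np.flatnonzero(np.diff(level[order])) + 1
    return [(idx // ny, idx % ny) for idx in np.split(order, boundaries)]


def sweep_wavefront(q, sigma, mu, eta, dx, dy, fronts):
    """Parallel sweep: update every cell of a wavefront in one vector step."""
    nx, ny = q.shape
    # One ghost row and column hold the vacuum (zero) incoming flux.
    psi = np.zeros((nx + 1, ny + 1))
    a, b = mu / dx, eta / dy
    for i, j in fronts:
        # Cells within a front do not depend on each other, so this
        # vectorized update plays the role of one data-parallel launch.
        psi[i + 1, j + 1] = (
            q[i, j] + a * psi[i, j + 1] + b * psi[i + 1, j]
        ) / (sigma + a + b)
    return psi[1:, 1:]


def sweep_sequential(q, sigma, mu, eta, dx, dy):
    """Reference: the same recurrence evaluated cell by cell."""
    nx, ny = q.shape
    psi = np.zeros((nx + 1, ny + 1))
    a, b = mu / dx, eta / dy
    for i in range(nx):
        for j in range(ny):
            psi[i + 1, j + 1] = (
                q[i, j] + a * psi[i, j + 1] + b * psi[i + 1, j]
            ) / (sigma + a + b)
    return psi[1:, 1:]
```

In this toy setting an `nx` by `ny` grid yields `nx + ny - 1` wavefronts, and the largest front holds `min(nx, ny)` cells, which bounds the concurrency available to a single direction and group. Mapping additional energy groups or directions onto further threads, as the abstract describes, is one way to raise that bound.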
Keywords/Search Tags: particle transport parallel algorithm, heterogeneous architecture, GPU, Sweep3D, MCNP, Monte Carlo method, discrete ordinates method, unstructured grid