Font Size: a A A

Efficient Parallelization of Non-uniform Fast Multipole Algorithm

Posted on:2019-02-08Degree:Ph.DType:Thesis
University:Michigan State UniversityCandidate:Hughey, Stephen MichaelFull Text:PDF
GTID:2470390017988519Subject:Computational physics
Abstract/Summary:
Many applications of the N-body problem today involve distributions of bodies that are (i) very large and (ii) highly non-uniform. A variety of fast multipole algorithms have been devised to reduce the cost from O(N2) to O(N log N) or O(N) for oscillatory and non-oscillatory problems, respectively. The issue of non-uniformity, however, presents significant challenges in parallelization, requiring a much more nuanced approach. Compounding this challenge, oscillatory N-body problems arising from wave physics (electromagnetics, acoustics, etc.) are burdened with capturing both phase and amplitude information as opposed to just the amplitude; non-uniformity even further complicates things. As a result, the algorithm and underlying data structures become extremely complicated, and parallelization becomes quite difficult.;This thesis aims to develop novel parallel fast multipole methods for both oscillatory and non-oscillatory problems that (i) are controllably accurate to arbitrary precision, (ii) are capable of efficiently handling highly non-uniform distributions, and (iii) scale well up to extremely large problem sizes and numbers of CPU cores. The accelerated Cartesian expansion (ACE) method and wideband multilevel fast multipole algorithm (MLFMA) are modified to accurately and efficiently accommodate non-uniform, and in the case of MLFMA extremely large, distributions in parallel. Several parallel algorithms for efficiently building the distributed non-uniform tree data structures are developed. Effective, novel algorithms are introduced to reduce load imbalances arising from non-uniformity and certain idiosyncrasies of the parallel wideband MLFMA which hamper scalability. The algorithms presented here meet each of the stated goals, enabling computations involving several hundred million degrees of freedom on 2048 cores for an electromagnetics problem and several billion particles on 16,384 cores for non-oscillatory problems.
Keywords/Search Tags:Non-uniform, Fast multipole, Non-oscillatory problems, Parallel
Related items