Font Size: a A A

Research Of Unbalanced Tree Search Based On Dynamic Granularity Strategy

Posted on:2017-05-30Degree:MasterType:Thesis
Country:ChinaCandidate:L GuanFull Text:PDF
GTID:2428330569998693Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
The increasingly developed high performance computer systems expose more parallelism to applications required massive calculation.However,some irregular and unbalanced applications pose a significant challenge to fully utilizing the power of current SMP cluster systems,because these applications always need asynchronous and continuous dynamic load balancing.The UTS benchmark is designed to character the performance of such kinds of applications through an exhaustive parallel depth-first search of a tree-structure search space that requires continuous dynamic load balancing.The standard OpenMP and MPI implementations of UTS are suitable for shared memory and distributed memory architecture,respectively.Nevertheless,these two implementations have notable drawbacks.To begin with,they cannot fully explore the computing capability of current multi-core SMP clusters.The standard OpenMP implementation of UTS is unable to directly extend to the whole cluster system since the OpenMP threads cannot access cluster-wide memory.Although the standard MPI implementation of UTS can run on the SMP clusters,it suffers poor scalability.In addition,both of these two unbalanced tree search implementations adopt a fixed steal granularity strategy,in order to obtain a state of load balance.Nevertheless,they cannot achieve a perfect load balancing state.UTS uses work stealing to achieve continuous and asynchronous load balancing.Regarding to the drawbacks of standard UTS benchmark,this paper introduces a dynamic granularity work stealing algorithm which is based on the MPI+OpenMP hybrid programming model,in order to accelerate unbalanced tree search and improve the scalability.The main contributions of this paper are stated as follows.First,based on the dynamic granularity strategy,we optimized the standard OpenMP implementation of UTS and implemented a new unbalanced tree search version which is suitable for shared memory systems.In this optimized unbalanced tree search implementation,the steal granularity changes dynamically with the number of released idle threads.This is to meet the work demand of every idle thread and promote load balancing among the working threads.Compared with the standard OpenMP implementation of UTS,our unbalanced tree search implementation suitable for shared memory systems can enlarge the interval of steal granularity as well as achieve high performance.Second,adopting the dynamic granularity strategy,we optimized the standard MPI implementation of UTS and implemented a new unbalanced tree search.After this,using the MPI+OpenMP hybrid programming model,we combine the above two different work stealing strategies.Our new work stealing strategy is proved to be suitable for multi-core SMP clusters and demonstrates strong scalability.Compared with the standard MPI implementation of UTS,our unbalanced tree search implementation suitable for SMP cluster demonstrate high performance and strong scalability.When scaling to 128 nodes,our implementation demonstrates high parallel efficiency(over 74%).We observe a performance improvement of up to 128% over the standard MPI implementation of UTS with a maximum search rate of around 3.39 billion nodes per second,for the benchmark studied in this paper.
Keywords/Search Tags:Load Balancing, SMP Cluster, Work Stealing, UTS, Dynamic Granularity, MPI+OpenMP Hybrid Programming Model
PDF Full Text Request
Related items