
Study On High Performance Computing Method In Phylogenetic Tree Likelihood Estimation Of Protein

Posted on: 2019-12-22 | Degree: Master | Type: Thesis
Country: China | Candidate: Y C Li | Full Text: PDF
GTID: 2428330551457977 | Subject: Software engineering
Abstract/Summary:
The parallelism offered by current GPU architectures has enabled domain scientists to accelerate model-based phylogenetic analysis methods at a much faster rate and a finer granularity than was previously possible with large-scale computer clusters. However, the lack of a systematic and reliable approach for parallelizing the analysis of protein sequences limits how effectively a wide range of phylogenetic analysis tools can use GPUs. This thesis attempts to bridge that gap by proposing a new parallelization approach. We study the impact of workloads, resource utilization, and variance in load levels when calculating conditional likelihood probabilities (CLPs). We then propose a more efficient method, tgpMC~3, for parallelizing the phylogenetic analysis of protein sequence data on GPUs. To ensure that the length of the input sequence is an integral multiple of the number of threads in a thread block, the proposed method uses NEs to reconstruct the CLPs matrix, which not only eliminates the cost of conditional branch divergence but also reduces the amount of GPU code-level optimization needed. Moreover, an improved semi-task parallel model is adopted to increase the number of active threads that can be dispatched on the GPU hardware. The method further uses an intermediate hash table to store keys and values, through which a transition probability matrix with fuzzy states can easily be transferred to shared memory. Compared with the serial MC~3 algorithm implemented on a single CPU core, the proposed method achieves a maximum speedup of 117.91× on the GPU nodes of the National Tianhe-1A supercomputer. Compared with the taMC~3 algorithm, a novel GPU-based method for analyzing datasets composed of protein sequences, our method is faster by factors of 1.79× to 2.36× on simulated datasets and 2.35× to 2.66× on real-world datasets on a single GPU node. The experimental results suggest that the proposed large-scale computing method for phylogenetic tree likelihood estimation offers clear advantages.
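
The padding idea mentioned in the abstract (making the number of sequence sites an integral multiple of the thread-block size so that no thread needs a bounds-check branch) can be illustrated with a minimal host-side sketch in Python. This is not the thesis's implementation: the function names, the 20-state amino-acid layout, the block size of 256, and the fill value of 1.0 are all assumptions chosen for illustration. A fill of 1.0 is the usual way to code a fully ambiguous site, which contributes a likelihood factor of 1 and therefore leaves the total log-likelihood unchanged.

```python
def padded_length(num_sites: int, threads_per_block: int) -> int:
    """Return the smallest multiple of threads_per_block that is >= num_sites."""
    if num_sites < 0 or threads_per_block <= 0:
        raise ValueError("num_sites must be >= 0 and threads_per_block > 0")
    # Ceiling division using integers only.
    num_blocks = (num_sites + threads_per_block - 1) // threads_per_block
    return num_blocks * threads_per_block


def pad_clp_rows(clp_rows, threads_per_block, num_states=20, fill=1.0):
    """Append filler rows so the row count is a multiple of threads_per_block.

    clp_rows: one list of num_states conditional likelihoods per site.
    fill:     assumed neutral value for the padded sites (hypothetical choice).
    """
    target = padded_length(len(clp_rows), threads_per_block)
    padding = [[fill] * num_states for _ in range(target - len(clp_rows))]
    # Copy the original rows so the caller's data is not modified.
    return [list(row) for row in clp_rows] + padding


if __name__ == "__main__":
    rows = [[0.05] * 20 for _ in range(1000)]
    padded = pad_clp_rows(rows, threads_per_block=256)
    print(len(rows), "->", len(padded))
```

With every block fully populated, a kernel operating on the padded matrix can index sites directly from the block and thread indices without a per-thread range check, which is the source of the divergence the abstract refers to.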
Keywords/Search Tags: High performance computing, heterogeneous computing, phylogenetic analysis, MC~3 algorithm, CUDA programming