Parallel Optimization For Multiple Sequence Alignment Based On CPU-GPU Heterogeneous System

Posted on: 2019-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Chen
GTID: 2428330593451009
Subject: Computer Science and Technology
Abstract/Summary:
Multiple sequence alignment (MSA) is a classical and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, parallelizing MSA has become necessary to keep its running time at an acceptable level.

In the single-user scenario, although there is a considerable body of work on MSA, much of it ignores large-scale datasets, in terms of both the number of sequences and the sequence lengths. In addition, prior studies parallelize MSA on GPU devices only, leaving the CPUs idle during the computation. To address these problems, this thesis presents CMSA, a robust and efficient MSA system for large-scale datasets on heterogeneous CPU-GPU platforms. It performs and optimizes multiple sequence alignment automatically for the sequences users submit, without any assumptions. CMSA adopts a co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of center sequence selection from O(mn^2) to O(mn). Experimental results show that CMSA achieves up to an 11× speedup and outperforms state-of-the-art software.

In the multi-user scenario, the complexity of MSA algorithms makes it a major challenge for a system to serve multiple users efficiently. This thesis therefore presents GMSA, an MSA system for multiple users on heterogeneous CPU-GPU platforms. GMSA focuses on the ClustalW algorithm and finds that the time complexity of its first step is comparatively high. Sharing the result of this first step makes it feasible to avoid recomputation and reduce computing time.

In conclusion, this thesis addresses large-scale multiple sequence alignment of sequences with high genetic similarity. Using the software and hardware technology of the CPU-GPU heterogeneous architecture, it designs a CPU-GPU heterogeneous architecture model that maximizes system resource utilization. In addition, considering the complexity of the tasks submitted by multiple users, it proposes a feasible optimization strategy based on sharing.
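
The sharing idea described for GMSA can be illustrated with a small sketch. It assumes that the shared first-step result is the pairwise distance between two sequences, and it uses plain edit distance as a stand-in for ClustalW's pairwise alignment score. The SharedPairwiseCache class and its method names are hypothetical and do not come from the thesis.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a simple stand-in for a pairwise alignment score."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class SharedPairwiseCache:
    """Hypothetical cache of first-step (pairwise) results shared by all users."""

    def __init__(self):
        self._store = {}
        self.computed = 0  # number of pairwise distances actually computed

    def distance(self, a: str, b: str) -> int:
        # The distance is symmetric, so both orderings share one cache entry.
        key = (a, b) if a <= b else (b, a)
        if key not in self._store:
            self._store[key] = edit_distance(a, b)
            self.computed += 1
        return self._store[key]

    def distance_matrix(self, seqs):
        """Build the symmetric distance matrix for one user's sequences."""
        n = len(seqs)
        matrix = [[0] * n for _ in range(n)]
        for i, j in combinations(range(n), 2):
            matrix[i][j] = matrix[j][i] = self.distance(seqs[i], seqs[j])
        return matrix


# Two users submit overlapping sequence sets to the same cache.
cache = SharedPairwiseCache()
user_a = ["ACGT", "ACGA", "TCGA"]
user_b = ["ACGT", "ACGA", "GGGA"]
matrix_a = cache.distance_matrix(user_a)
matrix_b = cache.distance_matrix(user_b)
```

In this sketch the two requests have one sequence pair in common, so the second request computes two distances instead of three. How GMSA actually stores and matches shared results is not stated in the abstract.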
Keywords/Search Tags: GPU, CUDA, Heterogeneous, Multiple Users, Sharing, Multiple Sequence Alignment