Research On Parallel Technology Of High Throughput Molecular Docking

Posted on: 2023-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Cui
Full Text: PDF
GTID: 2530307097494674
Subject: Computer technology
Abstract/Summary:
Molecular docking is a powerful tool for virtual drug screening. It can also be used to detect potential drug targets and to analyze the mechanism of drug-target interaction. A large number of organic molecules are now available for docking, but popular docking methods usually do not support large-scale docking data. Although some excellent docking software supports multi-threaded parallel docking on a single computer, fine-grained parallelism on a single node cannot significantly speed up large-scale docking tasks. Because enlarging the docking sample improves the quality of the screened compounds, docking must be accelerated as its scale grows.

To address these problems, this thesis builds on the general molecular docking software D3DOCKV2 and targets the defects of its multithreading tool: low parallel efficiency, limited parallel scale, and the inability to run docking tasks on multi-node clusters. Three parallel schemes are designed on three computing platforms: mp D3DOCKV2, a parallel docking program based on the Message Passing Interface (MPI); mr D3DOCKV2, a parallel docking program based on Hadoop; and IDOS (Improved D3DOCKV2 On Spark), a parallel docking program based on Spark. Large-scale docking experiments were carried out on supercomputers and cloud computing clusters. All three schemes showed good speedup and scalability, offering a choice of computing platforms and effective computing support for large-scale molecular docking.

The following research on parallel techniques for molecular docking is carried out in this thesis:

(1) mp D3DOCKV2: MPI is used to accelerate D3DOCKV2 docking on the Tianhe supercomputer. After grid map reuse and compilation optimization, the speedup and docking accuracy of D3DOCKV2 on the supercomputing cluster are tested with multiple threads.

(2) mr D3DOCKV2: The feasibility of accelerating D3DOCKV2 with a big data framework is explored. A scheme for molecular docking on a cloud computing cluster is designed, and docking tasks are parallelized with the native MapReduce computing framework of the Hadoop cluster. The experiments verify that a big data computing cluster can perform molecular docking tasks, and a near-linear speedup is obtained on the cloud computing cluster.

(3) IDOS: Building on the second work, the computing framework is replaced by Spark, which is based on in-memory computing. This optimizes the data processing flow and achieves a 5-6x speedup on a cloud computing cluster with a total of 6 computing nodes, with a parallel efficiency of more than 80%.

Finally, a general molecular docking interface is provided, so that the program accelerated in this thesis can be replaced with other molecular docking programs. The cluster operating environment for molecular docking is packaged in a Docker image, so that researchers can deploy the application quickly and flexibly.
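The speedup and parallel efficiency figures reported for IDOS can be related with a short calculation. The sketch below uses the common definition of parallel efficiency as speedup divided by the number of workers. It assumes the speedup is measured against a single-node run, which the abstract does not state, and the function name is illustrative only.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return parallel efficiency, defined as speedup / number of workers."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Figures taken from the abstract: 5-6x speedup on 6 computing nodes.
# Assumption: the baseline is one node of the same cluster.
NODES = 6

for s in (5.0, 6.0):
    eff = parallel_efficiency(s, NODES)
    print(f"speedup {s:.1f}x on {NODES} nodes -> efficiency {eff:.1%}")
```

Under that assumption, a 5x speedup on 6 nodes corresponds to an efficiency of about 83%, which is consistent with the stated figure of more than 80%.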
Keywords/Search Tags: Molecular Docking, Parallel Computing, MPI, Hadoop, Spark