Research And Analysis Of Large-Scale Virtual Screening Docking Results

Posted on: 2013-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2248330371987099
Subject: Computer system architecture
Abstract/Summary:
Molecular docking is one of the most important methods of virtual screening. Its aim is to speed up the drug discovery process by using computers to simulate the interaction between small-molecule ligands and biological macromolecule receptors. However, molecular docking is computationally intensive. For a single target protein, the docking process usually involves millions or even tens of millions of compounds. It therefore requires a large amount of computation and also produces a massive volume of docking result data. Performing a preliminary screening, so that molecules with high drug-likeness are docked first, is thus very valuable for shortening the virtual screening cycle and reducing pharmaceutical costs.

Taking as its example the massive result data produced by a virtual screening that targets the avian influenza virus H5N1, this thesis explores methods for large-scale distributed data analysis. To enable chemists to analyze docking result data conveniently, the work covers three areas. First, it uses a Hadoop cloud platform to compute and collect the molecular data and docking result data involved in the virtual screening. Second, to supply effective data for analysis and meet the needs of chemists, it provides a data sampling strategy. Finally, it uses MapReduce and Mahout to analyze the relationship between the properties of small molecules and the scoring function value.
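As a rough illustration of the third step (relating a small-molecule property to the scoring function value in MapReduce style), the sketch below groups docking scores by molecular-weight bin and averages them. It is not the thesis's implementation: the record layout `(molecule_id, molecular_weight, docking_score)`, the choice of molecular weight as the property, the bin width, and all function names are assumptions made for illustration, and the map, shuffle, and reduce phases are simulated in plain Python rather than run on Hadoop.

```python
from collections import defaultdict
from statistics import mean


def map_record(record, bin_width=100.0):
    """Map step: emit (property bin, docking score) for one molecule.

    `record` is a hypothetical (molecule_id, molecular_weight, score) tuple.
    """
    _molecule_id, mol_weight, score = record
    # Lower edge of the molecular-weight bin this molecule falls into.
    bin_start = (mol_weight // bin_width) * bin_width
    return bin_start, score


def shuffle(pairs):
    """Shuffle step: group all emitted scores by their bin key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_bin(scores):
    """Reduce step: summarize one bin as (mean score, molecule count)."""
    return mean(scores), len(scores)


def analyze(records, bin_width=100.0):
    """Run the simulated map -> shuffle -> reduce pipeline in-process."""
    pairs = (map_record(r, bin_width) for r in records)
    grouped = shuffle(pairs)
    return {key: reduce_bin(scores) for key, scores in grouped.items()}


if __name__ == "__main__":
    # Made-up records purely to exercise the code; not real docking results.
    sample = [
        ("mol-1", 150.0, -6.0),
        ("mol-2", 180.0, -8.0),
        ("mol-3", 320.0, -9.0),
    ]
    for bin_start, (avg, count) in sorted(analyze(sample).items()):
        print(f"weight bin starting at {bin_start}: mean score {avg}, n={count}")
```

On a real cluster the same mapper and reducer logic would be distributed across nodes, with the framework performing the grouping that `shuffle` imitates here.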
Keywords/Search Tags: Data analysis, Molecular docking, Hadoop, MapReduce, Mahout