Analytical Studies Of Large-scale Virtual Screening Data Based On Hadoop

Posted on: 2015-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: G C Liu
Full Text: PDF
GTID: 2268330431450996
Subject: Computer system architecture
Abstract/Summary:
Virtual screening is an important method in new drug discovery: it searches existing small-molecule databases for compounds that meet given criteria and screens them for lead compounds. So far, only about 10% of the more than 20 million known organic small-molecule compounds have been used in virtual screening, so a large number of potential compounds remain undiscovered. It is therefore of practical importance to carry out systematic virtual screening of all small-molecule compounds and to analyze the resulting large-scale virtual screening data for new drug discovery.

In this thesis, we take the H5N1 avian influenza virus as the target and small-molecule compounds from the ZINC database as ligands, and carry out analytical studies of large-scale virtual screening data based on Hadoop. One important piece of work is implementing parallel processing of large-scale data with the Chemistry Development Kit (CDK). We then use the RHadoop distributed data analysis platform and the support vector regression method to perform statistical analysis of the small-molecule compound data and the docking result data from virtual screening. The main work of this thesis is as follows:
1) Adding support for the MapReduce programming framework to CDK and, on this basis, building the CDK-Hadoop platform;
2) Analyzing the properties of the small-molecule compounds in the ZINC database using CDK-Hadoop;
3) Building the RHadoop platform on top of the Hadoop platform and applying it, with the support vector regression method, to the analysis of large-scale virtual screening data.
Keywords/Search Tags: Virtual Screening, Data Analysis, RHadoop, MapReduce
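
To make the MapReduce-style property analysis described in the abstract more concrete, the following is a minimal pure-Python sketch of the map, shuffle, and reduce pattern. It is not the thesis's CDK-Hadoop implementation: the record format, the compound identifiers, the molecular weight values, and the bin width are all hypothetical placeholders, and the weights are supplied as input rather than computed by CDK. The sketch only shows how a per-compound property could be binned in a map step and counted in a reduce step.

```python
from collections import defaultdict

# Hypothetical toy records in the form "compound_id<TAB>molecular_weight".
# These identifiers and values are placeholders, not data from ZINC.
RECORDS = [
    "cmpd-0001\t180.16",
    "cmpd-0002\t342.30",
    "cmpd-0003\t151.16",
    "cmpd-0004\t499.90",
    "cmpd-0005\t612.75",
]

BIN_WIDTH = 100.0  # assumed histogram bin width, in daltons


def map_record(line):
    """Map step: emit (bin_lower_bound, 1) for one compound record."""
    _compound_id, weight_text = line.split("\t")
    weight = float(weight_text)
    bin_lower = int(weight // BIN_WIDTH) * int(BIN_WIDTH)
    yield bin_lower, 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_bin(key, values):
    """Reduce step: count the compounds that fell into one weight bin."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in a single local process."""
    mapped = (pair for line in lines for pair in map_record(line))
    grouped = shuffle(mapped)
    return dict(reduce_bin(key, values) for key, values in sorted(grouped.items()))


histogram = run_job(RECORDS)
print(histogram)
```

On a real cluster the shuffle step is performed by the framework between the map and reduce phases; here it is written out explicitly so that the data flow is visible in one place.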