MolecularRank: An Algorithm For Finding Key Compounds In Molecular Similarity Network

Posted on: 2016-03-11
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Kong
Full Text: PDF
GTID: 2308330461467263
Subject: Computer technology
Abstract/Summary:
Drug development is an expensive and time-consuming process. Virtual screening (VS) uses a series of computational methods to select lead compounds from large compound databases. Compounds with similar structures may have similar physical and chemical properties. Similarity-based VS depends heavily on the structure-activity relationships of the specific target. In drug discovery programs, generating chemical structures from the chemical names and/or structural images found in patents and the literature is the initial step needed before the structural information can be used in further analyses. Predicting the key compounds in a patent application is also an important task. Key compounds correspond to key nodes in the molecular network. If the compound dataset is used in the early stages of drug discovery, the key compound may be the compound with the optimal physicochemical properties, the most biologically active tool or probe, or the compound with the most suitable pharmacokinetic profile for the desired indication. If the compound dataset is extracted from a patent specification, the key compound will most likely be the drug candidate.
Characterizing chemical libraries is an essential part of analyzing them. This study describes a potential use of network analysis methods to identify the key compounds of compound libraries. Molecules were organized into networks according to their structural similarity, computed from molecular fingerprints with the Tanimoto coefficient. Building on Google's PageRank algorithm, we propose the MolecularRank algorithm to predict key compounds in a molecular similarity network. We chose the large-scale drug screening database ChEMBL as the data source for this study. Furthermore, to meet the computing needs of big data, we implemented MolecularRank as a parallel algorithm using MapReduce.
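As an illustration of how molecules can be ordered into a similarity network, the following is a minimal sketch and not the thesis implementation. It assumes each fingerprint is represented as a set of on-bit indices, and the function names, the toy compounds, and the 0.5 threshold are all hypothetical choices made for this example.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(fp_a & fp_b) / union


def build_similarity_network(fingerprints, threshold=0.5):
    """Return {compound: {neighbor: similarity}} for pairs at or above threshold.

    The threshold value is an assumption for illustration only.
    """
    network = {name: {} for name in fingerprints}
    for a, b in combinations(fingerprints, 2):
        similarity = tanimoto(fingerprints[a], fingerprints[b])
        if similarity >= threshold:
            # Similarity is symmetric, so the edge is stored in both directions.
            network[a][b] = similarity
            network[b][a] = similarity
    return network


# Hypothetical toy fingerprints, not taken from ChEMBL.
toy_fingerprints = {
    "cpd_A": {1, 2, 3, 4},
    "cpd_B": {2, 3, 4, 5},
    "cpd_C": {3, 4, 5, 6},
    "cpd_D": {10, 11},
}
toy_network = build_similarity_network(toy_fingerprints, threshold=0.5)
```

In this toy data, cpd_B is connected to both cpd_A and cpd_C, while cpd_D shares no bits with the others and stays isolated.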
By iterating on the state transition matrix of the molecular similarity network, MolecularRank can accurately find the key compounds located at the important nodes of the network. Finally, we use the MolecularRank algorithm to predict the key compound in a patent specification. Compared with the cluster seed analysis described by Hattori et al., the MolecularRank algorithm shows a significant advantage because it is insensitive to the similarity threshold.
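The iterative calculation on a state transition matrix can be sketched with a generic weighted PageRank power iteration. This is an assumption-laden illustration, not the MolecularRank algorithm as defined in the thesis: the damping factor of 0.85, the uniform handling of isolated nodes, the function name, and the toy network are all hypothetical.

```python
def rank_compounds(network, damping=0.85, tol=1e-10, max_iter=200):
    """PageRank-style scores for a weighted, undirected similarity network.

    network maps each compound to {neighbor: similarity}. Each row of the
    implied transition matrix is a node's edge weights divided by their sum.
    """
    nodes = sorted(network)
    n = len(nodes)
    if n == 0:
        return {}
    scores = {v: 1.0 / n for v in nodes}
    out_weight = {v: sum(network[v].values()) for v in nodes}
    for _ in range(max_iter):
        # Score held by isolated nodes is redistributed uniformly (assumption).
        dangling = sum(scores[v] for v in nodes if out_weight[v] == 0.0)
        new_scores = {}
        for v in nodes:
            # Edges are symmetric, so network[v] also lists v's in-neighbors.
            incoming = sum(
                scores[u] * w / out_weight[u] for u, w in network[v].items()
            )
            new_scores[v] = (1.0 - damping) / n + damping * (incoming + dangling / n)
        delta = sum(abs(new_scores[v] - scores[v]) for v in nodes)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Hypothetical toy network: a chain A - B - C plus an isolated compound D.
toy_network = {
    "cpd_A": {"cpd_B": 0.6},
    "cpd_B": {"cpd_A": 0.6, "cpd_C": 0.6},
    "cpd_C": {"cpd_B": 0.6},
    "cpd_D": {},
}
toy_scores = rank_compounds(toy_network)
key_compound = max(toy_scores, key=toy_scores.get)
```

In this toy network the central compound cpd_B receives the highest score, and the scores sum to 1 because every row of the transition matrix is normalized.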
Keywords/Search Tags: Virtual Screening, Molecular Network, MolecularRank Algorithm