
Computational Methods For Drug Target Identification Based On Data Mining

Posted on: 2014-10-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Y Gong
GTID: 1261330425480897
Subject: Computer application technology
Abstract/Summary:
The process of new drug discovery and development is time-consuming, costly and risky. The average development cycle for a new drug is about 15 years and consumes more than 800 million US dollars. However, because of low efficacy and side effects, drug development often fails in the clinical trial phase, causing huge losses. As the starting point of drug development, drug target discovery and identification is critical to the success rate of drug development. With the development of bioinformatics technology and the rapid growth of proteomics and chemical genomics data, the combination of computational chemical biology and traditional experimental techniques provides information support for drug target discovery and novel methods for drug target prediction. Aiming at drug target discovery, this thesis constructs a drug target database and a pesticide target database by applying computational methods for data mining and integration to existing drug target databases, and develops several methods and tools for drug target discovery. The thesis consists of the following five parts:

1. Based on the indications of existing drugs and of drugs under development, we constructed a drug target database called TargetBank by means of text mining and manual validation. The database includes 4,357 records of target proteins, covering 23 therapy domains and more than 600 clinical indications. It also includes detailed function annotation for each protein record, providing a useful reference for drug development researchers.

2. We constructed the first database of the interaction network between pesticides and targets, named PTID. By means of data integration, the database contains 1,342 pesticide records, divided into 22 categories by function and annotated with environmental fate and ecotoxicology information. By means of text mining, 4,245 records of protein targets that interact with pesticides were collected, and an interaction network between pesticides and targets was constructed with detailed annotation of protein sequence and function. The database also provides computational tools such as similarity search and sequence alignment, offering data support for pesticide research.

3. We designed and implemented a molecular solvent-accessible surface generation program based on iso-surface theory. The method uses a fast one-dimensional recursive Gaussian filter calculation, constructs an iso-surface in discrete three-dimensional space, and obtains a triangle representation of the molecular solvent-accessible surface with the iso-surface extraction technique "Marching Cubes". The central difference method is applied to calculate the normal vectors of the triangles for better rendering. Test results indicate that the method is both fast and accurate, and that it can provide drug target surface representations for structure-based target identification. The program has been integrated into the D3Pharma software package.

4. We developed a random-walk-based polypharmacology method that combines molecular similarity methods with network pharmacology concepts. The method introduces the compound of interest into an existing drug-target network using molecular similarity, then discovers its polypharmacology effects and predicts its new targets or side effects with a bipartite, random-walk-based network inference algorithm. By predicting recently reported off-target events, the method was shown to be capable of predicting polypharmacology effects.

5. Based on SHAFTS, a molecular similarity evaluation program developed by our research group, this thesis implemented ChemMapper, a molecular-similarity-based platform for target discovery and virtual screening. The platform collects structure information for more than 3.5 million compounds, about 400 thousand of which have target annotations. It can predict targets or side effects for a query structure based on three-dimensional molecular similarity methods, and it can also perform virtual screening or scaffold hopping studies on commercial compound databases. As shown by the accurate prediction of toxicity for the existing drug Astemizole and by lead compound discovery and scaffold hopping validation for EGFR, ChemMapper can facilitate research on genomics, target discovery, polypharmacology and virtual screening.
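The random-walk-based inference described in part 4 can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the algorithm's details, so the code below assumes a plain random walk with restart, drug-target edge weights of 1.0, and a restart probability of 0.3. The node names (`Q`, `D1`, `T1`, ...) and the similarity values are hypothetical. The query compound is linked to known drugs by its molecular similarity scores, and candidate targets are ranked by the steady-state probability that a walker starting at the query visits them.

```python
def random_walk_with_restart(edges, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker that restarts at `seed`.

    `edges` is a list of (node_a, node_b, weight) tuples with weight > 0;
    the graph is treated as undirected. `seed` must appear in at least one edge.
    """
    # Build a symmetric weighted adjacency map.
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    nodes = list(adj)

    # Start with all probability mass on the seed node.
    p = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        # A fraction `restart` of the mass jumps back to the seed each step.
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        # The rest moves to neighbours in proportion to edge weight.
        for u in nodes:
            total = sum(adj[u].values())
            for v, w in adj[u].items():
                nxt[v] += (1.0 - restart) * p[u] * w / total
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def rank_targets(drug_targets, query_similarity, query="Q"):
    """Rank targets for a query compound attached to the network by similarity."""
    edges = [(d, t, 1.0) for d, ts in drug_targets.items() for t in ts]
    edges += [(query, d, s) for d, s in query_similarity.items() if s > 0]
    scores = random_walk_with_restart(edges, seed=query)
    targets = {t for ts in drug_targets.values() for t in ts}
    return sorted(targets, key=lambda t: scores[t], reverse=True)


# Hypothetical toy network: three known drugs and their annotated targets.
known = {"D1": ["T1", "T2"], "D2": ["T2"], "D3": ["T3"]}
# Hypothetical similarity of the query compound to each known drug.
similarity = {"D1": 0.8, "D2": 0.5, "D3": 0.1}
print(rank_targets(known, similarity))
```

In this toy network, targets shared by several drugs that resemble the query rank ahead of targets reached only through a weakly similar drug, which is the intuition behind using the network rather than the single nearest neighbour.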
Keywords/Search Tags:Target Identification, Network Pharmacology, Polypharmacology, Data Mining, Molecular Similarity