Parallel inductive logic programming for pharmacophore discovery

Posted on: 2003-09-29
Degree: Ph.D.
Type: Thesis
University: University of Louisville
Candidate: Kamal, Ahmed Hussein
Full Text: PDF
GTID: 2468390011485013
Subject: Computer Science
Abstract/Summary:
Inductive logic programming (ILP) has attracted great interest within the machine learning, artificial intelligence, data mining, and bioinformatics communities because of its logical foundations, its ability to use background knowledge and structured data representations, its comprehensible results, and its noteworthy application successes, as in the case of structure-based drug design.

This dissertation presents a parallel inductive logic programming system and evaluates its performance. The system is applied to the search for a maximal pharmacophore for angiotensin-converting enzyme (ACE) inhibition drugs and to a combinatorial library for inhibiting Pseudomonas aeruginosa. The system uses message passing to allow parallel tasks to communicate with one another, and it was built on top of a PVM layer to ensure portability across a wide variety of hardware platforms. The system shows how easily sequential ILP systems can be parallelized by partitioning the hypothesis space across a loosely coupled collection of parallel inductive tasks. A new potential ACE inhibitor pharmacophore with five active sites was discovered.

This parallel ILP architecture should be adaptable to a wide range of inductive logic programming problems in which the hypothesis space can be distributed among the parallel processors for testing against the data set. The system demonstrated linear speedup with respect to the number of processors. In one experiment, a speedup in excess of 91% was obtained on a sixty-four-processor system. Three main factors were found to affect the efficiency of the system: the interprocessor communication cost, the checkpoint backup cost, and the degree of hypothesis partitioning. The granularity of the hypothesis space partitioning determines the amount of interprocessor communication traffic, which may increase or decrease as the partitioning becomes finer or coarser.

Although frequent checkpointing degrades the overall performance of the parallel system because of the backup cost and the communication cost between the parallel tasks and the network file system (NFS), it is necessary in practice to use this feature to recover from system crashes in long-running experiments.
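The idea of partitioning the hypothesis space among parallel tasks can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract describes PVM message passing between loosely coupled tasks, whereas this sketch uses threads from Python's standard library as a stand-in. The names `partition`, `coverage`, `best_in_chunk`, and `parallel_search`, the round-robin split, the "positives covered minus negatives covered" score, and the toy threshold hypotheses are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(hypotheses, n_workers):
    """Split the hypothesis list into n_workers round-robin chunks."""
    return [hypotheses[i::n_workers] for i in range(n_workers)]


def coverage(predicate, examples):
    """Assumed score: positives covered minus negatives covered."""
    pos = sum(1 for x, label in examples if label and predicate(x))
    neg = sum(1 for x, label in examples if not label and predicate(x))
    return pos - neg


def best_in_chunk(chunk, examples):
    """Work of one task: test its share of the hypotheses against the data."""
    scored = [(coverage(pred, examples), name) for name, pred in chunk]
    return max(scored) if scored else None


def parallel_search(hypotheses, examples, n_workers=4):
    """Evaluate each chunk in its own worker, then merge the local winners."""
    chunks = partition(hypotheses, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda c: best_in_chunk(c, examples), chunks))
    return max(r for r in results if r is not None)


# Toy data (hypothetical): labelled integers and threshold rules "x > t".
EXAMPLES = [(1, False), (2, False), (5, True), (7, True), (8, True)]
HYPOTHESES = [(f"x>{t}", lambda x, t=t: x > t) for t in range(10)]

if __name__ == "__main__":
    print(parallel_search(HYPOTHESES, EXAMPLES, n_workers=3))
```

In this sketch each worker returns only its local best hypothesis, so communication is limited to one small result per chunk. Choosing more, smaller chunks would raise the number of results exchanged, which loosely mirrors the abstract's point that partitioning granularity affects communication traffic.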
Keywords/Search Tags: Inductive logic programming, Parallel, System, Hypothesis space, Pharmacophore