Statistical design and analysis of high throughput screening data using pooling experiments and data mining techniques

Posted on:2005-03-07

Degree:Ph.D

Type:Dissertation

University:North Carolina State University

Candidate:Remlinger, Katja S

Full Text:PDF

GTID:1458390008981002

Subject:Statistics

Abstract/Summary:

Discovery of a new drug involves screening large chemical libraries to identify new and diverse active compounds. Only a very small percentage of the compounds in a library are active. Naive approaches that test every compound in the library are undesirable: in addition to being expensive, they provide little information about which aspects of the chemical structure of active compounds are related to activity.

This work investigates pooling experiments as one possible approach to improving screening efficiency and gaining insight into structure-activity relationships. Four different pooling designs are proposed using two design criteria: optimal coverage of the chemical space and minimal collision between compounds. We evaluate each method by determining how well the design criteria are met and whether the method is able to find many diverse active compounds. One pooling design emerges as the winner, but all designed pools clearly outperform randomly created pools. Furthermore, different approaches to analyzing the pooling designs are investigated. Multiple trees are compared to model-based likelihood approaches with different covariate class definitions. Results show that a model-based likelihood approach with a multiple-trees-lower-bound covariate class definition gives the best performance.

Another possible approach to improving screening efficiency and gaining insight into structure-activity relationships is the use of data mining techniques such as RandomForest and ChemTree. These techniques are applied to individual compounds.
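To make the efficiency argument for pooling concrete, the following is a minimal sketch of a two-stage screen using randomly created pools, the baseline that the designed pools are compared against. It is not the dissertation's method: the function name `two_stage_pooling`, the pool size, and the noise-free rule that a pool tests active exactly when at least one member is active are all assumptions made for illustration.

```python
import random


def two_stage_pooling(activity, pool_size, seed=0):
    """Randomly pool compounds, test each pool, then retest members of active pools.

    activity: list of booleans, True if the compound is active
              (hypothetical ground truth, unknown in a real screen).
    Returns (number_of_tests, sorted indices of the actives found).
    """
    rng = random.Random(seed)
    order = list(range(len(activity)))
    rng.shuffle(order)  # random pools; a designed scheme would pick members deliberately
    pools = [order[i:i + pool_size] for i in range(0, len(order), pool_size)]

    tests = len(pools)  # stage 1: one test per pool
    found = []
    for pool in pools:
        # Assumption: a pool tests active iff at least one member is active
        # (no assay noise, no blocking or synergy between compounds).
        if any(activity[i] for i in pool):
            tests += len(pool)  # stage 2: retest each member individually
            found.extend(i for i in pool if activity[i])
    return tests, sorted(found)


if __name__ == "__main__":
    # Hypothetical library: 1000 compounds, 10 of them active (1%).
    activity = [i < 10 for i in range(1000)]
    tests, found = two_stage_pooling(activity, pool_size=10)
    print(f"tests used: {tests} (vs. {len(activity)} for one-at-a-time screening)")
    print(f"actives recovered: {len(found)}")
```

Under these idealized assumptions, the sketch recovers every active compound with at most 200 tests instead of 1000, because most pools contain no active compound and are dismissed after a single test. It says nothing about structure-activity relationships or about the coverage and collision criteria described above.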

Keywords/Search Tags:

Screening, Compounds, Data, Pooling

Related items

1	Research On Raster Image Processor Based On Pre-Function Library Screening Algorithm
2	Research On Resampling Methods Of Imbalanced Data Based On Data Screening
3	Researches On Dynamic Screening System
4	Design Of Reliability Screening Scheme For Electronic Components
5	Research On Data Storage And Data Screening Based On Large Data
6	Prediction of chemical properties and biological activities of organic compounds from molecular structure and use of pattern recognition techniques for the analysis of data from an optical sensor array
7	Feature Screening For Ultrahigh Dimensional Discriminant Analysis With Mixed Data
8	Air Traffic Control Personnel Accountability Management System Based On Multi-level Data Screening
9	Analytical Studies Of Large-scale Virtual Screening Data Based On Hadoop
10	Research On Data Screening And Transaction Sorting Service For Blockchain Integrated Internet Of Things