Feature selection and statistical alternatives for machine learning applied to in-silico drug design

Posted on:2003-08-13

Degree:Ph.D

Type:Dissertation

University:Rensselaer Polytechnic Institute

Candidate:Arciniegas, Fabio Andres

GTID:1468390011480674

Subject:Operations Research

Abstract/Summary:

Feature selection has recently been the subject of intensive research in data mining, especially for datasets with a large number of descriptive attributes, such as QSAR (Quantitative Structure-Activity Relationship) data. QSAR is an in-silico drug design methodology that requires identifying the important features of molecules that explain a relevant drug property. A typical QSAR dataset for predicting an activity of interest is characterized by a large number of descriptive features (300–1000) for a relatively small number of compounds (molecules).

Finding the best feature subset for a given problem with N features by exhaustive search requires evaluating all 2^N possible subsets. The best feature subset also depends on the predictive model that will be employed to predict future, unknown values of the response variables of interest. Feature selection involves minimizing the number of features retained while maximizing the predictive power of the model. From this point of view, feature selection can be viewed as a special type of multi-objective optimization problem.

This dissertation proposes machine learning algorithms as predictive modeling tools for QSAR problems, and develops a novel approach for feature selection based on feature saliency. In addition, this approach is computationally less expensive than other machine learning feature selection methods (e.g., weight pruning for ANNs), and it works for any nonparametric regression algorithm.
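The subset count and the two competing objectives in the abstract can be illustrated with a small sketch. It is not the dissertation's saliency-based method, which the abstract does not specify. Everything here is an assumption made for illustration: the toy data from `make_toy_data`, the k-nearest-neighbour regressor standing in for "any nonparametric regression algorithm", and the `penalty` term that folds the two objectives (few features, low error) into a single score.

```python
import random
from itertools import combinations


def subset_count(n_features):
    """Number of possible feature subsets, including the empty one: 2^N."""
    return 2 ** n_features


def make_toy_data(n_rows, n_features=4, seed=0):
    """Hypothetical data: the response depends only on features 0 and 2."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_features)] for _ in range(n_rows)]
    y = [3.0 * row[0] - 2.0 * row[2] for row in X]
    return X, y


def knn_predict(train_X, train_y, x, features, k=3):
    """Average the responses of the k nearest training rows.

    Distances are computed on the selected `features` only.
    """
    scored = sorted(
        (sum((row[j] - x[j]) ** 2 for j in features), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = scored[:k]
    return sum(target for _, target in nearest) / len(nearest)


def holdout_mse(train_X, train_y, test_X, test_y, features):
    """Mean squared error on a held-out set for one feature subset."""
    errors = [
        (knn_predict(train_X, train_y, x, features) - target) ** 2
        for x, target in zip(test_X, test_y)
    ]
    return sum(errors) / len(errors)


def exhaustive_search(train_X, train_y, test_X, test_y, penalty=0.01):
    """Score every non-empty subset; only feasible for very small N.

    The score is held-out error plus an assumed per-feature penalty,
    one simple way to scalarize the two objectives.
    """
    n = len(train_X[0])
    best_subset, best_score, evaluated = None, float("inf"), 0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            evaluated += 1
            mse = holdout_mse(train_X, train_y, test_X, test_y, subset)
            score = mse + penalty * size
            if score < best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score, evaluated


if __name__ == "__main__":
    train_X, train_y = make_toy_data(80, seed=0)
    test_X, test_y = make_toy_data(40, seed=1)
    subset, score, evaluated = exhaustive_search(train_X, train_y, test_X, test_y)
    print("subsets evaluated:", evaluated)
    print("best subset:", subset, "score:", round(score, 4))
    # At the lower end of the QSAR range quoted above, exhaustive search is hopeless:
    print("digits in 2^300:", len(str(subset_count(300))))
```

With four features the search visits only 15 non-empty subsets. At the 300 to 1000 descriptors typical of QSAR data the same loop could never finish, which is why heuristic selection methods are needed.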

Keywords/Search Tags:

Feature selection, Machine learning, Drug, QSAR

Related items

1	Machine Learning Based Adverse Drug Reaction Extraction From Text
2	Study On The Structure - Activity Relationship Model Of Heavy Metal Deposited By Modified Wheat Straw Based On Machine Learning
3	Drug Name Recognition And Drug-Drug Interaction Extraction Based On Machine Learning
4	Drug-Drug Interaction Extraction And Drug Warning For Chronic Diseases Via Machine Learning
5	Research And Application Of Integrated Feature Selection Algorithm Based On Extreme Learning Machine
6	Research On Feature Selection For Machine Learning
7	Kernel methods in computer-aided constructive drug design
8	Research On Two-Stage Feature Selection Methods In Machine Learning
9	Research On Influencing Factors And Forecast Of Drug Sales In Retail Pharmacies Based On Machine Learning
10	Feature Selection And Its Application In Classification