The Application Of Feature Selection In Gene Expression Data Analysis

Posted on: 2012-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Yang
GTID: 2178330335985899
Subject: Computational Mathematics
Abstract/Summary:
Feature selection is currently one of the hot topics in pattern recognition, especially in the field of bioinformatics. DNA microarray technology allows the expression levels of thousands of genes to be measured simultaneously, resulting in a vast pool of gene expression data. Gene expression data typically has a high dimension and a small sample size. Feature selection is therefore crucial for analyzing gene expression data correctly, and identifying the disease-related genes remains a challenging problem.

In this paper, we propose an improved classification algorithm and a novel gene selection algorithm, both based on the binary decision tree. The main contributions of this paper are:

1. The improved classification algorithm based on binary decision tree. We propose an improved classification algorithm based on binary decision tree (CABDT) by integrating the ID3 (Iterative Dichotomiser 3), C4.5 and CART (Classification and Regression Trees) algorithms. To reduce the effect of data noise on the classification results, we implement a postpruning technique based on the empirical risk, which gives a postpruned classification algorithm based on binary decision tree (P-CABDT).

2. The application of feature selection in gene expression data analysis. Because gene expression data typically has a high dimension and a small sample size, feature selection (namely gene selection) is crucial for gene expression data analysis. We propose a novel gene selection algorithm based on binary decision tree (GSABDT) for gene expression data. The proposed algorithm is an embedded method. It eliminates gene redundancy automatically and yields a very small number of cancer-related genes, thereby reducing the solution size of the classification problem.
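The idea of embedded gene selection with a binary decision tree can be illustrated with a minimal sketch. This is not the GSABDT algorithm from the thesis, whose details the abstract does not give. It is a plain CART-style tree with Gini impurity, and the genes that appear in its split nodes are treated as the selected subset. All function names and the toy expression matrix are assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (gene, threshold, weighted impurity) of the best split, or None."""
    n, best = len(y), None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


def build_tree(X, y, max_depth=3):
    """Grow a binary tree; leaves are class labels, internal nodes are dicts."""
    majority = Counter(y).most_common(1)[0][0]
    if max_depth == 0 or gini(y) == 0.0:
        return majority
    split = best_split(X, y)
    if split is None or split[2] >= gini(y):
        return majority  # no split reduces the impurity
    j, t, _ = split
    left = [i for i in range(len(y)) if X[i][j] <= t]
    right = [i for i in range(len(y)) if X[i][j] > t]
    return {
        "gene": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], max_depth - 1),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], max_depth - 1),
    }


def selected_genes(tree):
    """Genes used in split nodes: the embedded feature subset."""
    if not isinstance(tree, dict):
        return set()
    return {tree["gene"]} | selected_genes(tree["left"]) | selected_genes(tree["right"])


def predict(tree, row):
    """Classify one sample by walking from the root to a leaf."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["gene"]] <= tree["threshold"] else tree["right"]
    return tree


# Hypothetical toy data: 6 samples x 5 genes; only gene 2 separates the classes.
X = [
    [0.75, 0.25, 1.0, 0.5, 0.25],
    [0.25, 0.75, 1.5, 0.5, 0.75],
    [0.5, 0.5, 0.5, 0.5, 0.125],
    [0.75, 0.25, 3.0, 0.5, 0.75],
    [0.25, 0.75, 2.5, 0.5, 0.25],
    [0.5, 0.5, 3.5, 0.5, 0.125],
]
y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

tree = build_tree(X, y)
print(selected_genes(tree))  # {2}
```

Because selection happens while the classifier is trained, genes that add nothing once an informative gene has been used never enter the tree. This is the sense in which an embedded method can remove redundancy without a separate filtering step.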
Keywords/Search Tags: Feature selection, Binary decision tree, Classification method, Gene expression data