Dimensionality Reduction For Gene Expression Data Classification

Posted on: 2013-12-04  Degree: Master  Type: Thesis
Country: China  Candidate: Z L Li  Full Text: PDF
GTID: 2230330371494345  Subject: Precision instruments and machinery
Abstract/Summary:
DNA microarray technology has had a revolutionary effect on bioinformatics and biomedicine. With this technique, researchers can conveniently extract a large amount of biological information from DNA or RNA to obtain gene expression datasets. Because these datasets are high-dimensional and sparsely sampled, researchers often run into the "curse of dimensionality". As a result, feature extraction for gene expression data has become one of the core research topics in data mining and classification.

Feature extraction here means deriving new features as linear or non-linear functions of the original set of features. Once the essential features of the original data have been extracted, redundant or irrelevant information can be removed and the underlying structure of the data can be discovered.

In this paper, two well-known gene expression datasets, the GCM data and the Lymphoma data, are chosen as the experimental material for data classification. To address the curse of dimensionality, this paper adopts dimensionality reduction methods such as PCA, LTSA, MDS and Laplacian Eigenmaps to extract the features of the gene expression data, and uses an RBF-SVM (a Support Vector Machine with a radial basis function kernel) to classify the datasets. The experimental results indicate that SVM is well suited to machine learning on small samples, and that dimensionality reduction as a feature extraction approach greatly improves classification accuracy.
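The pipeline described in the abstract (dimensionality reduction followed by RBF-SVM classification) can be illustrated with a minimal sketch. This is not the thesis's code: it assumes scikit-learn, uses PCA as the only reduction step, and replaces the GCM and Lymphoma data with a synthetic dataset that merely imitates the "few samples, many features" shape of microarray data. The sample count, feature count, number of components, and SVM settings are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a gene expression matrix (assumption, not real data):
# 80 samples, 2000 features, of which only 20 carry class information.
X, y = make_classification(
    n_samples=80,
    n_features=2000,
    n_informative=20,
    n_redundant=0,
    n_classes=2,
    random_state=0,
)


def build_pipeline(n_components):
    """Standardize, project onto principal components, classify with an RBF-kernel SVM."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=0),
        SVC(kernel="rbf", C=1.0, gamma="scale"),
    )


# Fitting the reduction inside the pipeline means PCA is re-estimated on each
# training fold, so no information from the held-out fold leaks into it.
pipeline = build_pipeline(n_components=10)
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"mean 5-fold accuracy: {scores.mean():.3f}")
```

In scikit-learn the other methods named in the abstract are available as `LocallyLinearEmbedding(method="ltsa")` for LTSA, `MDS`, and `SpectralEmbedding` for Laplacian Eigenmaps, all in `sklearn.manifold`. `MDS` and `SpectralEmbedding` provide only `fit_transform`, so they cannot be dropped into the cross-validated pipeline above without an additional out-of-sample extension.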
Keywords/Search Tags: DNA microarray, gene expression data, dimensionality reduction, Support Vector Machine