Application Of AdaBoost In Gene Expression Data Classification

Posted on: 2018-08-25; Degree: Master; Type: Thesis
Country: China; Candidate: M X Chang; Full Text: PDF
GTID: 2348330536960930; Subject: Computer technology
Abstract/Summary:
The relationship between gene expression profiles and tumors has been studied extensively. Unsupervised learning methods such as clustering algorithms are commonly used in functional genomics, but supervised learning methods can be applied directly to gene expression data that carry a class attribute. Gene data analysis and gene classification have become effective ways to diagnose tumors and cancers. However, some studies have shown that a single simple classification model has drawbacks in achieving accurate gene data classification and cancer diagnosis. Meanwhile, classifier ensembles have received increasing attention in gene data mining. In this work we address the gene data classification problem using the AdaBoost algorithm, an extension of the original Boosting algorithm. To select a suitable subset of informative genes, several feature selection algorithms, including ReliefF, FCBF and PCA, are considered. To assess the effectiveness of AdaBoost, other classification algorithms, including Decision Trees, Support Vector Machines and Bagging, are also applied. To better analyze the influence of the number of iterations, the actual number of iterations performed is obtained by parsing the output of the algorithm's toString function in WEKA. 10-fold cross-validation was performed on two real gene expression data sets. By tuning the feature selection algorithm, the number of selected features, the type of weak classifier and the number of iterations, good experimental results were obtained on both data sets. These results indicate that AdaBoost is effective for gene classification.
Keywords/Search Tags: AdaBoost, Gene Classification, Feature Selection
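
The abstract distinguishes the requested number of boosting iterations from the number actually performed. The following is a minimal Python sketch of why the two can differ, assuming binary labels in {-1, +1}, decision stumps as the weak classifier, and a common stopping rule (stop when the weighted error reaches 0 or is at least 0.5). It does not reproduce the thesis experiments or WEKA's implementation. All function names and the toy data are hypothetical.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    # A decision stump: one feature, one threshold, labels in {-1, +1}.
    return polarity if row[feature] > threshold else -polarity


def best_stump(X, y, w):
    # Exhaustive search for the stump with the lowest weighted error.
    best = None
    for feature in range(len(X[0])):
        values = sorted(set(row[feature] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, max_rounds):
    # Returns a list of (alpha, feature, threshold, polarity) tuples.
    # Its length is the actual number of iterations performed.
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(max_rounds):
        found = best_stump(X, y, w)
        if found is None:  # no feature has two distinct values
            break
        err, feature, threshold, polarity = found
        if err >= 0.5:  # weak learner is no better than chance: stop
            break
        # Assumption: clamp the error to avoid division by zero.
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, feature, threshold, polarity))
        if err == 0:  # perfect weak learner: further rounds add nothing
            break
        # Increase the weight of misclassified samples, then renormalize.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    # Weighted vote of all stumps; ties are resolved as +1.
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: one "expression" value per sample.
X_toy = [[float(v)] for v in range(10)]
y_toy = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(X_toy, y_toy, max_rounds=3)
print("rounds performed:", len(model))

# Linearly separable data: boosting stops early despite max_rounds=10.
X_sep = [[1.0], [2.0], [3.0], [4.0]]
y_sep = [-1, -1, 1, 1]
print("rounds performed:", len(adaboost(X_sep, y_sep, max_rounds=10)))
```

Reporting `len(ensemble)` rather than `max_rounds` is the point of the sketch: when a weak classifier fits the weighted data perfectly or fails to beat chance, the loop ends before the requested number of rounds.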