
CUP Classification Based On MiRNA Feature Selection And SVM Classifier

Posted on: 2015-03-21  Degree: Master  Type: Thesis
Country: China  Candidate: X X Zhang  Full Text: PDF
GTID: 2268330428985368  Subject: Electronics and Communications Engineering
Abstract/Summary:
To address the low sensitivity of earlier miRNA-based methods for identifying the origin of a cancer, we adopt a tree-structured model for identifying a variety of Cancer of Unknown Primary (CUP) origins in this paper. To improve classification sensitivity, we apply miRNA feature selection followed by a classifier at each node of the tree. This research comprises two main tasks. The first is miRNA feature selection. Feature selection is an important way to improve the performance of a classifier: removing miRNAs with weak discriminative ability can improve overall accuracy, and removing low-importance miRNAs speeds up computation. We therefore performed miRNA feature selection before classification, using three feature selection methods related to information gain: the evaluators are Information Gain Ranking Filter, Gain Ratio Feature Evaluator, and CFS Subset Evaluator, with Ranker or Best First as the corresponding ranking method or search strategy. All of them are provided by Weka. We use these methods to select effective miRNAs, combine the selections with results from previous research, and identify the most useful miRNAs. We then perform layer-by-layer classification based on the expression levels of the selected miRNAs. The second task is classifier selection. For a classification problem, the design of the model and the choice of classifier are critical. We take an existing tree structure as the basic framework, so the remaining task is to find a classifier that performs well for the binary classification at each node of the tree.
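The information-gain ranking step described above can be illustrated with a small, self-contained sketch. This is not Weka's implementation: the function names, the miRNA names (`mir_a`, `mir_b`), and the expression values are hypothetical, and the median split used to discretize each feature is a simplification chosen for brevity that is not claimed to match how Weka handles numeric attributes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(values, labels):
    """Information gain of one numeric feature, binarized at its median.

    The median split is an illustrative assumption, not the thesis's method.
    """
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    low = [y for x, y in zip(values, labels) if x < median]
    high = [y for x, y in zip(values, labels) if x >= median]
    total = len(labels)
    # Class entropy remaining after the split, weighted by partition size.
    conditional = sum(
        (len(part) / total) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Return feature names sorted by decreasing information gain."""
    scores = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: four samples from two tissue classes.
labels = ["liver", "liver", "lung", "lung"]
table = {
    "mir_a": [9.1, 8.7, 2.3, 2.9],  # high in liver, low in lung
    "mir_b": [5.0, 1.0, 5.2, 1.1],  # varies independently of the class
}
ranking = rank_features(table, labels)
```

In this toy example the feature whose split separates the two classes is ranked first; a ranking of this kind is what lets weakly discriminative miRNAs be dropped before classification.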
After comparing the C4.5 decision tree, the CART decision tree, the KNN classifier, and the SVM classifier, we find that the SVM classifier with a polynomial kernel performs best; its robustness and general applicability to binary classification problems make it well suited to this task. All experiments were carried out in Weka. The final overall sensitivity on the test set is 87%, a 1% improvement over the work of Rosenfeld et al. Moreover, the sensitivity for CUP samples in the test set increased from 77% to 86%, a substantial gain of 9 percentage points. We used 10-fold cross-validation to select the parameters of the whole model. The final test-set sensitivity is very close to the cross-validation sensitivity on the training set, which indicates that the CUP classification method based on miRNA selection and classifier choice is reliable and does not overfit.
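The layer-by-layer classification and the sensitivity measure used above can be sketched as follows. Everything here is an assumption for illustration: the `Node` class, the two-level tree, the tissue labels, and the data are hypothetical, and simple threshold rules stand in for the trained polynomial-kernel SVMs that the thesis places at each node.

```python
from collections import defaultdict


class Node:
    """Internal tree node holding one binary decision."""

    def __init__(self, decide, left, right):
        self.decide = decide  # callable: sample -> True (go left) or False (go right)
        self.left = left      # either another Node or a tissue label (str)
        self.right = right


def classify(tree, sample):
    """Walk from the root to a leaf, applying one binary decision per node."""
    current = tree
    while isinstance(current, Node):
        current = current.left if current.decide(sample) else current.right
    return current


def sensitivity_per_class(true_labels, predicted_labels):
    """Per-class sensitivity: correctly identified samples / all true samples."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical two-level tree; threshold rules are placeholders for classifiers.
tree = Node(
    lambda s: s["mir_a"] > 5.0,
    "liver",
    Node(lambda s: s["mir_b"] > 3.0, "lung", "colon"),
)

# Hypothetical samples with their true tissue of origin.
samples = [
    {"mir_a": 9.0, "mir_b": 1.0},
    {"mir_a": 2.0, "mir_b": 4.0},
    {"mir_a": 1.0, "mir_b": 1.0},
    {"mir_a": 1.5, "mir_b": 3.5},
]
truth = ["liver", "lung", "colon", "colon"]

predicted = [classify(tree, s) for s in samples]
scores = sensitivity_per_class(truth, predicted)
```

Because each node only has to solve a two-way problem, any binary classifier with a predict-style interface could be dropped in for the `decide` callables, which is why the choice of node classifier can be studied separately from the tree structure.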
Keywords/Search Tags: miRNA, SVM, CUP, Feature Selection, Classification