The Research Of Potential Biomarkers Selection Algorithm Based On SVM-RFE

Posted on: 2012-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q Ruan
GTID: 2218330368487995
Subject: Computer software and theory
Abstract:
We live in an era of information explosion. Data mining techniques matter because finding the key points in vast amounts of data lets us achieve more with less effort. If data mining uncovers the regular patterns hidden in massive data, those patterns can be used to predict future events in fields such as agriculture, meteorology, and earthquake forecasting. Multivariate statistical analysis, machine learning, and pattern recognition are popular techniques in data mining.

SVM-RFE, a feature selection method based on support vector machines (SVM), is now widely used across a broad range of applications and performs well in them. Noise, however, degrades the performance of SVM-RFE. This thesis proposes a feature selection method, SVM-ReliefF-RFE, that uses the filter method ReliefF to evaluate features and assist the SVM. In the experiments, cross-validation was used to evaluate SVM-ReliefF-RFE on a real liver disease data set. Compared with the original SVM-RFE, the average accuracy of the proposed method improved by at least 0.66% and at most 3.6%, and the method found better and more informative feature subsets.

The thesis then proposes a second feature selection method, a two-stage method. In the first stage, artificial variables are used to filter out noise and data irrelevant to the problem; the filtered data then pass to the next stage. In the second stage, SVM-RFE performs feature selection. In the experiments, the average accuracy of this method improved by 1.74% compared with SVM-RFE, and the features it selects show significant differences.

Both proposed methods improve on the original SVM-RFE by filtering noise from two different points of view. Analysis of the selected feature subsets confirms the superiority of the proposed methods.
Keywords: Data Mining, SVM, SVM-RFE, ReliefF, Artificial Variable
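
For readers unfamiliar with the baseline, the following is a minimal sketch of plain SVM-RFE, the method both proposed variants are compared against. It is not the thesis's implementation: the ReliefF weighting and the artificial-variable stage are not reproduced, since the abstract does not specify them. The sketch assumes the commonly used ranking criterion (the squared weight of each feature in a linear SVM, with the lowest-scoring feature removed each round). The function names, the toy data, and the hyperparameters `lam`, `lr`, and `epochs` are all illustrative assumptions.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by full-batch subgradient descent on the
    L2-regularised hinge loss. Hyperparameters are illustrative."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y):
    """Recursive feature elimination: retrain, drop the feature with the
    smallest squared weight, repeat. Returns feature indices ordered
    from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    # The surviving feature ranks first; the earliest eliminated ranks last.
    return remaining + eliminated[::-1]


def make_toy_data(n=40, n_noise=3, seed=0):
    """Hypothetical data: feature 0 carries the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        signal = label * (2.0 + rng.uniform(-0.5, 0.5))
        noise = [rng.uniform(-0.5, 0.5) for _ in range(n_noise)]
        X.append([signal] + noise)
        y.append(label)
    return X, y


X, y = make_toy_data()
ranking = svm_rfe(X, y)
print(ranking)  # feature 0 (the informative one) is ranked first
```

On this toy data the informative feature always survives to the last round, because its contribution to every weight update outweighs that of the bounded noise features. With real, noisy data no such guarantee holds, which is the weakness the two proposed methods target.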