
The Study And Application Of Feature Selection Algorithms Based On Relief

Posted on: 2014-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: X L Li
GTID: 2248330395499645
Subject: Computer application technology
Abstract/Summary:
With the rapid development of science and technology, society has entered an era of information explosion. Data mining reveals the information hidden in data and transforms massive, high-dimensional data into useful information and knowledge. Feature selection is an important research direction within data mining. A feature selection algorithm reduces the number of features by removing irrelevant and redundant ones, which improves model accuracy and shortens the running time of the learning algorithm. In addition, selecting the relevant features simplifies the model and makes it easier for researchers to understand the process that generated the data.
Relief is an effective feature selection algorithm. Unlike the ReliefF algorithm, Multi-Relief extends Relief from two-class to multi-class problems by running the Relief algorithm on two-class samples obtained through repeated random sampling. Because a single round of sampling draws only two classes from the data, the selected samples are not sufficiently representative. To combine the sampling results effectively and measure the feature weights accurately, this thesis first proposes an improved Multi-Relief multi-class feature selection algorithm. The algorithm divides the weight vectors into groups and deletes positive weights whose occurrence frequency falls below a threshold, which yields a new weight ensemble method. Experiments on 3 liver datasets and 3 public datasets show that, on most of the datasets, the improved algorithm achieves higher average accuracy than the comparison algorithms. This thesis also applies the ReliefF-RFE algorithm to biological data. After analyzing the SVM-based ReliefF-RFE algorithm, the SVM classifier is replaced with kNN, and features are deleted in batches or one at a time on the basis of the ReliefF result.
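For reference, the basic two-class Relief weight update that the algorithms above build on can be sketched as follows. This is a minimal illustration of classic Relief only, not the improved Multi-Relief ensemble proposed in the thesis. The function name `relief_weights`, the Manhattan distance, and the assumption that features are already scaled to [0, 1] are choices made for this sketch.

```python
import random


def relief_weights(X, y, n_iter=None, seed=0):
    """Basic two-class Relief (illustrative sketch).

    X: list of feature lists, assumed scaled to [0, 1].
    y: list of class labels (two classes).
    Returns one weight per feature; larger means more relevant.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    n_iter = n_iter or n
    w = [0.0] * d

    def dist(a, b):
        # Manhattan distance over all features
        return sum(abs(p - q) for p, q in zip(a, b))

    for _ in range(n_iter):
        i = rng.randrange(n)  # randomly sampled instance
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue
        h = min(hits, key=lambda j: dist(X[i], X[j]))    # nearest hit
        m = min(misses, key=lambda j: dist(X[i], X[j]))  # nearest miss
        for f in range(d):
            # Reward features that differ on the nearest miss,
            # penalise features that differ on the nearest hit.
            w[f] += (abs(X[i][f] - X[m][f]) - abs(X[i][f] - X[h][f])) / n_iter
    return w


# Toy data: feature 0 separates the classes, feature 1 is noise.
X_demo = [[0.0, 0.3], [0.1, 0.9], [0.05, 0.6],
          [0.9, 0.35], [1.0, 0.85], [0.95, 0.55]]
y_demo = [0, 0, 0, 1, 1, 1]
demo_weights = relief_weights(X_demo, y_demo)
```

On this toy data the informative feature receives a positive weight and the noise feature a negative one, which is the signal that Relief-family methods rank features by.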
This thesis carries out multiple sets of experiments comparing the improved ReliefF-RFE algorithm with the classic ReliefF algorithm on 2 liver datasets and 6 high-dimensional public biological datasets. The experimental results show that, on these data, the improved algorithm achieves higher average accuracy than the comparison algorithms. This thesis examines the Relief family of algorithms from two different directions, and both approaches prove effective as measured by classification accuracy under 10 runs of 10-fold cross-validation.
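The idea of deleting low-ranked features one at a time or in batches, with a kNN classifier judging each remaining subset, can be sketched roughly as below. The abstract does not give the exact procedure, so everything here is an assumption: the function names, the re-scoring at every round, the leave-one-out evaluation (used for brevity in place of the 10 runs of 10-fold cross-validation reported above), and the `class_mean_gap` scorer, which is a simple stand-in and not ReliefF.

```python
from collections import Counter


def knn_loo_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of kNN restricted to the given feature indices."""
    correct = 0
    for i in range(len(X)):
        # Manhattan distance to every other sample over the kept features
        neighbours = sorted(
            (sum(abs(X[i][f] - X[j][f]) for f in features), j)
            for j in range(len(X)) if j != i
        )
        votes = Counter(y[j] for _, j in neighbours[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)


def class_mean_gap(X, y, features):
    """Stand-in scorer (NOT ReliefF): gap between the two class means."""
    labels = sorted(set(y))[:2]
    scores = {}
    for f in features:
        means = []
        for c in labels:
            vals = [row[f] for row, lab in zip(X, y) if lab == c]
            means.append(sum(vals) / len(vals))
        scores[f] = abs(means[0] - means[1])
    return scores


def recursive_elimination(X, y, score_fn, step=1, k=1):
    """Drop the `step` lowest-scored features per round; keep the best subset.

    step=1 deletes a single feature per round, step>1 deletes in batches.
    """
    features = list(range(len(X[0])))
    best_subset = list(features)
    best_acc = knn_loo_accuracy(X, y, features, k)
    while len(features) > 1:
        scores = score_fn(X, y, features)
        n_drop = min(step, len(features) - 1)
        ranked = sorted(features, key=lambda f: scores[f], reverse=True)
        features = sorted(ranked[:-n_drop])
        acc = knn_loo_accuracy(X, y, features, k)
        if acc >= best_acc:  # ties favour the smaller subset
            best_subset, best_acc = list(features), acc
    return best_subset, best_acc


# Toy data: feature 0 is informative, features 1 and 2 are noise.
X_demo = [[0.0, 0.3, 0.9], [0.1, 0.9, 0.1], [0.05, 0.6, 0.5],
          [0.9, 0.35, 0.8], [1.0, 0.85, 0.2], [0.95, 0.55, 0.45]]
y_demo = [0, 0, 0, 1, 1, 1]
subset, acc = recursive_elimination(X_demo, y_demo, class_mean_gap, step=1)
```

A Relief-style weight function with the same `(X, y, features)` signature could be passed as `score_fn` in place of the stand-in scorer. Setting `step` above 1 corresponds to batch deletion.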
Keywords: Feature Selection, Relief Algorithm, Recursive Feature Elimination