Mutual information, or information gain, is a good indicator of the relevance or dependence between variables and has been used in feature selection algorithms, commonly known as Mutual Information based Feature Selection (MI-FS) algorithms. The performance of MI-FS algorithms depends on the accuracy of the mutual information estimated from finite observed data samples. However, the common mutual information estimation procedures are biased. In this paper, we introduce an improved plug-in entropy estimator for discrete variables, the Grassberger entropy estimate (GEE), and use it for entropy and mutual information estimation. In decision tree induction, GEE has been shown to yield more accurate mutual information estimates and more effective decision trees. By applying GEE-based mutual information in the normalized mutual information feature selection algorithm (NMIFS), we obtain an improved algorithm, which we name G-NMIFS; this paper focuses on G-NMIFS. To compare the performance of G-NMIFS and NMIFS, we tested both algorithms on six data sets covering two-class and multi-class classification tasks: for each data set, we selected the top k (1 ≤ k ≤ 50) features with each algorithm as a feature subset, trained an RBF-kernel support vector machine (SVM) on each subset, evaluated the subsets by cross-validation classification accuracy, and performed the Wilcoxon signed-rank test. The experimental results show that, on all six data sets, (1) the G-NMIFS feature subsets achieve higher classification accuracy than the NMIFS subsets for most values of k; (2) the G-NMIFS subsets are consistently more accurate than the NMIFS subsets over some consecutive ranges of k; (3) G-NMIFS yields the feature subsets with the best overall classification accuracy; and (4) G-NMIFS statistically outperforms NMIFS. In short, G-NMIFS outperforms NMIFS. On this basis, we plan to integrate the feature selection algorithm into an EEG-based stress monitoring system.
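To illustrate the kind of bias correction involved, the sketch below contrasts a naive plug-in entropy estimate with a Grassberger-style digamma-corrected estimate and uses it to compute the mutual information between two discrete variables. This is a minimal sketch, not the paper's implementation: the function names are ours, and the correction term is one commonly cited form of Grassberger's estimator, which may differ from the exact variant used here.

```python
import numpy as np
from scipy.special import digamma

def plugin_entropy(counts):
    """Naive plug-in (maximum-likelihood) entropy estimate in nats."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts[counts > 0] / n
    return -np.sum(p * np.log(p))

def grassberger_entropy(counts):
    """Grassberger-style corrected entropy estimate (one common form, assumed here):
    H ~ log(N) - (1/N) * sum_i n_i * G(n_i), with
    G(n) = psi(n) + 0.5 * (-1)^n * (psi((n+1)/2) - psi(n/2))."""
    counts = np.asarray(counts, dtype=float)
    counts = counts[counts > 0]
    n = counts.sum()
    g = digamma(counts) + 0.5 * (-1.0) ** counts * (
        digamma((counts + 1) / 2) - digamma(counts / 2)
    )
    return np.log(n) - np.sum(counts * g) / n

def mutual_information(x, y, entropy=grassberger_entropy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from a contingency table."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)
    hx = entropy(joint.sum(axis=1))
    hy = entropy(joint.sum(axis=0))
    hxy = entropy(joint.ravel())
    return hx + hy - hxy
```

In a G-NMIFS-style selection loop, `mutual_information` with the corrected entropy would replace the plug-in estimates inside NMIFS's normalized relevance and redundancy terms, leaving the greedy selection procedure itself unchanged.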