
The Feature Selection Based On Mutual Information And Decision Tree

Posted on: 2020-11-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhou
Full Text: PDF
GTID: 2428330596986965
Subject: Mathematics, Probability Theory and Mathematical Statistics

Abstract/Summary:
We have now entered the age of big data, and it is necessary to extract useful information from ultra-high-dimensional data to solve problems. To prevent irrelevant information from interfering with the analysis, the dimensionality of the data must be reduced. There are many dimensionality-reduction techniques, and feature selection is one of them. In this paper, a supervised feature selection method, GA-MAX-NMIFS, which combines normalized mutual information and decision trees, is proposed. First, a candidate data set is obtained by selecting variables from the original data set with the MAX-NMIFS method. Then a decision tree is used to find the subset of the candidate data set with the maximum classification accuracy, which is taken as the prospective data set, and a genetic algorithm is applied to optimize the prospective data set into the target data set. Finally, the GA-MAX-NMIFS method was used to reduce the dimensionality of a simulated data set and 16 real data sets. The reduced data sets were used for classification, and the results were compared with those obtained on the initial data sets. The results show that the GA-MAX-NMIFS method achieves good classification performance and effectively selects the important features, while maintaining classification accuracy comparable to that of classification with the full feature set.
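The first stage described above, selecting variables by normalized mutual information, can be illustrated with a minimal sketch. Note this is an assumed reading of the MAX-NMIFS step, not the thesis's exact algorithm: it greedily picks the feature whose NMI with the label is highest after subtracting its maximum NMI with already-selected features (a common relevance-minus-redundancy heuristic); the function names, the normalization by the larger entropy, and the penalty form are all illustrative choices.

```python
# Hedged sketch of a MAX-NMIFS-style greedy filter on discrete data.
# All names (entropy, nmi, max_nmifs) and the scoring rule are assumptions
# for illustration; the thesis may define the criterion differently.
from math import log2
from collections import Counter

def entropy(xs):
    """Shannon entropy (bits) of a discrete sequence."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def nmi(xs, ys):
    """Normalized mutual information; here normalized by max(H(X), H(Y))."""
    denom = max(entropy(xs), entropy(ys))
    return mutual_info(xs, ys) / denom if denom > 0 else 0.0

def max_nmifs(features, labels, k):
    """Greedily select up to k feature indices, scoring each candidate by
    its NMI with the labels minus its largest NMI with selected features."""
    selected, remaining = [], list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            relevance = nmi(features[j], labels)
            redundancy = max((nmi(features[j], features[i]) for i in selected),
                             default=0.0)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, given one feature identical to the labels and one uncorrelated with them, the sketch selects the informative feature first; the decision-tree and genetic-algorithm stages would then search over such candidate subsets.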
Keywords/Search Tags:High-dimensional data, Feature selection, Normalized mutual information(NMI), Decision tree