
Research on Feature Selection Based on Evolutionary Algorithms

Posted on: 2018-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: B Liu
Full Text: PDF
GTID: 2348330539985367
Subject: Computer Science and Technology
Abstract/Summary:
Feature selection is the process of selecting a subset of the original feature set according to some criterion. Its goal is to reduce the computational complexity of the learning algorithm and to improve its performance by removing irrelevant and redundant features. Feature selection is an effective way to deal with the curse of dimensionality, and it plays a key role in machine learning. Investigating the feature selection problem is therefore meaningful in theory and valuable in practice, especially for machine learning in the era of big data.

To address feature selection for discrete-valued data, this thesis proposes two feature selection methods based on evolutionary computation. The first uses relative classification information entropy as the fitness function to measure the significance of a feature subset, and the feasibility of this measure is proved theoretically. This approach uses a genetic algorithm and particle swarm optimization to search for the optimal feature subset. The second method is similar to the first, but uses the inconsistency rate as the fitness function to measure the significance of a feature subset.

The two proposed methods are compared experimentally, leading to the following conclusions: (a) When the same fitness function is used, the method that searches for the optimal feature subset with particle swarm optimization outperforms the one that searches with a genetic algorithm, in both testing accuracy and convergence rate. (b) When the two fitness functions are compared, the method that uses relative classification information entropy outperforms the one that uses the inconsistency rate. In addition, this thesis investigates extending the proposed methods to continuous-valued features.

The proposed algorithm has three characteristics: (1) it is simple and easy to implement; (2) the feature subset it selects is much more suitable for describing the objects than those selected by other methods; (3) it has good semantic interpretability.
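As an illustration of the second fitness function, the sketch below computes an inconsistency rate for a candidate feature subset encoded as a 0/1 mask, the kind of encoding a genetic algorithm or binary particle swarm would typically search over. It assumes the commonly used definition (instances that share the same values on the selected features but fall outside the majority class of that group count as inconsistent); the abstract does not state the exact formula used in the thesis, and the function name, mask encoding, and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, mask):
    """Inconsistency rate of the feature subset selected by a 0/1 mask.

    Assumed definition (not confirmed by the abstract): group instances by
    their values on the selected features; in each group, every instance
    outside the majority class is inconsistent. The rate is the total
    number of inconsistent instances divided by the number of instances.
    """
    selected = [i for i, bit in enumerate(mask) if bit]

    # Count class labels for each pattern of values on the selected features.
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in selected)
        groups[pattern][label] += 1

    inconsistent = sum(
        sum(counts.values()) - max(counts.values())
        for counts in groups.values()
    )
    return inconsistent / len(rows)


# Hypothetical toy data: five instances, three discrete features, two classes.
rows = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0)]
labels = ["a", "b", "a", "a", "b"]

for mask in [(1, 1, 1), (1, 1, 0), (0, 0, 0)]:
    print(mask, inconsistency_rate(rows, labels, mask))
```

A lower rate means the subset separates the classes better, so a search procedure would minimise this value (or maximise its complement), usually together with some preference for smaller subsets.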
Keywords/Search Tags:Feature selection, Genetic algorithm, Particle swarm optimization algorithm, Relative classification information entropy, Inconsistency