Research On Feature Selection Algorithm Based On Multimodal Evolutionary Computation

Posted on: 2022-08-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Wang
GTID: 1488306326494464
Subject: Control Science and Engineering
Abstract/Summary:
Most real-world classification problems involve a large number of raw features, some of which are irrelevant or redundant. Such features not only slow down classification and disturb the learning process, but also degrade classification performance. Feature selection (FS), a technique that removes irrelevant or redundant features to improve classification performance and efficiency, has therefore been widely and successfully applied in many real-world applications. However, traditional FS methods such as filter, wrapper, embedded, and hybrid approaches have disadvantages: they suffer from the nesting effect, their key parameters are difficult to set, and they are easily trapped in local optima. Evolutionary computation (EC) has been widely and successfully applied to FS because of its promising local and global search ability. As data collection techniques develop, feature dimensionality increases significantly and the correlations among features become more complex, so the objectives under investigation become multimodal, which poses new challenges for EC-based FS.

By investigating the characteristics of FS problems, this thesis focuses on studying and improving the performance of evolutionary multimodal FS methods, with the aim of obtaining the best classification performance with the minimal number of features. To extract feature subsets that are highly relevant to the classification target accurately and stably, it is necessary to investigate FS methods comprehensively and to design new methods that address the challenges in existing work. This thesis studies feature selection algorithms based on multimodal evolutionary computation from the perspectives of the number of FS objectives, the search mechanism, and so on. The core work is summarized as follows:

1. A feature selection algorithm based on coevolution and two-stage decomposition (CCFS/TD) is designed to address irrelevant and redundant features and the high computational cost of high-dimensional, large-scale data. The algorithm combines a two-stage decomposition strategy with a coevolution technique to reduce data dimensionality, and uses differential evolution to search for the feature subset. Because a single evolutionary algorithm or plain coevolution retains a large number of features on high-dimensional problems, a new decomposition strategy is designed. It breaks the traditional evolutionary process into several successive sub-evolution processes by randomly shuffling the order of the feature dimensions. Features can form different combinations in each sub-evolution process, so that information can be exchanged among different features. Experimental comparisons on different types of high-dimensional data sets show that CCFS/TD effectively reduces the feature dimension and selects a feature subset that yields better classification performance.

2. An ensemble feature selection algorithm based on fitness Euclidean-distance ratio differential evolution (EFS_FERDE) is proposed to find multiple globally or locally optimal feature subsets. A classification model built on a single feature subset is prone to overfitting, which harms generalization. To avoid this problem, the FERDE multimodal optimization algorithm is adopted as the feature subset search method to obtain multiple optimal or suboptimal feature subsets that differ substantially from one another and achieve high classification accuracy. The classifiers trained on the obtained subsets are combined into an ensemble by voting to improve generalization. Experimental results show that EFS_FERDE has good stability and generalization performance.

3. In FS problems, decision makers want to use the fewest features to obtain the most satisfactory classification accuracy, but these two objectives conflict and cannot be achieved simultaneously. The FS problem can therefore be regarded as a multiobjective optimization problem whose goal is to find the set of tradeoff solutions between the two objectives (i.e., the fewest features and the highest classification accuracy). However, the tradeoff solution set may contain different feature combinations that have the same number of features and the same classification accuracy. A feature selection algorithm (FS_FERDE_MMO) is designed to solve such multimodal multiobjective problems in FS. In FS_FERDE_MMO, the concept of non-dominated solutions is applied to FERDE to search the feature subsets and find multiple Pareto optimal sets. Decision makers can then choose from the obtained optimal sets the subset with the fewest features that still gives high accuracy.
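The multimodal multiobjective situation described in the third contribution can be illustrated with a minimal sketch. It is not the FS_FERDE_MMO algorithm itself: it only filters a hand-made list of candidate feature subsets down to the non-dominated ones under two minimized objectives (number of features and classification error), then groups subsets that share the same objective values. The candidate data and the helper names `dominates`, `non_dominated`, and `group_by_objectives` are hypothetical and chosen purely for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Each candidate: (feature subset, number of features, classification error).
# Both objectives are minimized. The values below are made up for illustration.
Candidate = Tuple[FrozenSet[int], int, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse in both objectives and better in one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def non_dominated(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def group_by_objectives(
    front: List[Candidate],
) -> Dict[Tuple[int, float], List[FrozenSet[int]]]:
    """Group subsets with identical objective values (the multimodal case)."""
    groups: Dict[Tuple[int, float], List[FrozenSet[int]]] = {}
    for subset, n_features, error in front:
        groups.setdefault((n_features, error), []).append(subset)
    return groups


candidates: List[Candidate] = [
    (frozenset({0, 3}), 2, 0.10),
    (frozenset({1, 4}), 2, 0.10),     # same objectives, different features
    (frozenset({0, 1, 3}), 3, 0.05),
    (frozenset({2}), 1, 0.30),
    (frozenset({0, 2, 4}), 3, 0.20),  # dominated by the two-feature subsets
]

front = non_dominated(candidates)
groups = group_by_objectives(front)

for (n_features, error), subsets in sorted(groups.items()):
    print(n_features, error, [sorted(s) for s in subsets])
```

In this toy data, two different two-feature subsets reach the same error, so both survive the filter. An optimizer that tracks only objective values would keep one of them, whereas a multimodal method aims to report both to the decision maker.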
Keywords: Classification, Evolutionary Computation, Feature Selection, Cooperative Coevolution, Multimodal Multi-objective Optimization