
A Two-stage Hybrid Ant Colony Optimization Algorithm For High-dimensional Feature Selection

Posted on: 2022-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: X B Zhou
Full Text: PDF
GTID: 2518306605965999
Subject: Master of Engineering
Abstract/Summary:
The "curse of dimensionality" caused by high-dimensional datasets increases memory use and running time and degrades the classification performance of learning methods. Feature selection reduces the data dimension by eliminating redundant and irrelevant features, thereby improving the learning algorithm's performance. However, finding the optimal feature subset is an NP-hard problem, and traditional greedy search methods easily fall into a local optimum. Ant colony optimization, a swarm intelligence algorithm, is widely used in feature selection because of its strong global and local search capabilities and its flexible graph representation. However, current feature selection methods based on ant colony optimization are mainly applied to low-dimensional datasets. For datasets with thousands of dimensions, searching for the optimal feature subset becomes extremely difficult because the search space grows exponentially. In this thesis, we propose a two-stage hybrid ant colony optimization algorithm for high-dimensional feature selection. The specific research content is as follows:

(1) For the two commonly used feature search space representations in ant colony algorithms, experimental analysis confirms that the fully connected representation is more suitable for high-dimensional feature selection. In addition, the inherent correlations between features are used to speed up the search for the optimal feature subset, and classifier evaluation is used to further improve the classification performance of the feature subset. Experiments show that the hybrid ant colony algorithm achieves better results than the single-model method.

(2) It is difficult to determine the number of features to select from prior knowledge of a high-dimensional dataset. The proposed two-stage ant colony algorithm uses an interval strategy to determine the size of the optimal feature subset for the subsequent search. Compared with traditional one-stage methods, which determine the size of the optimal feature subset and search for that subset simultaneously, checking the performance of some feature-number endpoints in advance helps to reduce the complexity of the algorithm and keep it from getting trapped in a local optimum. Test results on eleven high-dimensional public datasets show that the feature subset obtained by the proposed method has the best performance on most datasets. Compared with other feature selection methods based on ant colony algorithms, the running time is also shorter.

(3) To address the problem that ants engage in almost no mutual learning and learn only from iteration information, we combine the crossover operator from the genetic algorithm with the number of feature visits in the population to design a crossover operator for feature selection. This increases the algorithm's global search ability and further improves the classification performance of the selected feature subset. In addition, compared with traditional standard feature selection methods, the improved two-stage hybrid ant colony algorithm is more suitable for high-dimensional feature selection.
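
The abstract gives no formulas for the interval strategy or the crossover operator, so the following is only a minimal Python sketch of how the two ideas might look. The function names, the evenly spaced candidate sizes, the caller-supplied `score_at_size` function, and the rule of preferring frequently visited features are all assumptions made for illustration; they are not taken from the thesis.

```python
import random
from collections import Counter


def pick_subset_size(n_features, n_intervals, score_at_size):
    """Stage one (assumed form): score a few evenly spaced candidate
    subset sizes (interval endpoints) and return the best-scoring one."""
    step = max(1, n_features // n_intervals)
    endpoints = list(range(step, n_features + 1, step))
    # score_at_size is a hypothetical callback, e.g. classifier accuracy
    # of a quickly built subset with that many features.
    return max(endpoints, key=score_at_size)


def visit_counts(population):
    """Count how often each feature index appears across all ants' subsets."""
    return Counter(f for subset in population for f in subset)


def crossover(parent_a, parent_b, counts, size, rng):
    """Build a child subset of at most `size` features from two parents.

    Assumption: features present in both parents are kept first, and the
    remaining slots are filled from the other parental features, preferring
    those visited more often in the population. Ties are broken randomly.
    """
    shared = set(parent_a) & set(parent_b)
    rest = sorted((set(parent_a) | set(parent_b)) - shared)
    rng.shuffle(rest)  # random tie-breaking before the stable sort below
    rest.sort(key=lambda f: counts[f], reverse=True)
    child = sorted(shared, key=lambda f: counts[f], reverse=True)[:size]
    child += rest[: size - len(child)]
    return sorted(child)


if __name__ == "__main__":
    rng = random.Random(0)
    population = [[0, 1, 2], [1, 2, 3], [2, 3, 4]]  # toy feature subsets
    counts = visit_counts(population)
    print(crossover(population[0], population[2], counts, 3, rng))
    # Toy scoring function that peaks near 45 features (purely illustrative).
    print(pick_subset_size(100, 5, lambda k: -abs(k - 45)))
```

In a two-stage setup of this kind, `pick_subset_size` would run once to fix the subset size, and `crossover` would then be applied between ants during the second-stage search so that ants exchange information directly instead of only through pheromone updates.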
Keywords/Search Tags: Feature selection, ant colony optimization, high-dimensional dataset, optimal feature subset size, crossover operator