
Research And Application Of Feature Selection Method Based On Heuristic Optimization

Posted on: 2021-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Wu
Full Text: PDF
GTID: 2428330602982515
Subject: Engineering
Abstract/Summary:
With the rapid development of information technology, we are entering the era of big data, and data volumes are exploding in every field. "Massive" refers not only to a large volume of data but also to high dimensionality. How to extract truly valid information from large amounts of data is a central question of data mining and machine learning research. Feature selection is one of the main research directions; its core task is to select, from a high-dimensional feature set, an effective low-dimensional subset of features relevant to the processing task. This thesis proposes two improved algorithms to address the problem that a single feature selection algorithm cannot balance running efficiency and accuracy when processing data. The research was funded by the Zhejiang Provincial Natural Science Foundation. The main work and results are as follows.

(1) A two-stage feature selection fusion algorithm with automatic parameter optimization is proposed to address the poor performance of filter or wrapper feature selection algorithms when either is used alone on high-dimensional data. The Maximal Information Coefficient is first used to pre-screen features according to their correlation with the class attribute; redundant features among those remaining are then removed based on the Pearson Correlation Coefficient. Finally, a genetic algorithm is used to automatically optimize the two hyperparameters of these two selection stages. The fusion algorithm combines the wrapper approach's strong ability to identify critical features with the filter approach's ability to quickly screen for features related to the target category, effectively reducing the dimensionality of the feature set while keeping the classification accuracy of the selected subset within an acceptable range.

(2) To address the limited search capability of a single heuristic algorithm, this thesis combines the Whale Optimization Algorithm and Simulated Annealing to propose a wrapper feature selection algorithm based on hybrid optimization. The algorithm uses the Max-Relevance and Min-Redundancy criterion as the evaluation criterion for feature selection. It first applies the Whale Optimization Algorithm to search the entire feature space more thoroughly, and then uses Simulated Annealing to improve the best solution found by the Whale Optimization Algorithm in each iteration. The Whale Optimization Algorithm locates the region most likely to contain the global optimum, while Simulated Annealing performs an efficient local search; together they improve the search efficiency of the feature selection algorithm.

(3) Based on the Qt application development framework, visual software for operating the feature selection algorithms is built. The software supports data set import, parameter setting, classifier selection, result display, and other functions.
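The two-stage filter described in part (1) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation. It assumes the two tuned hyperparameters are a relevance threshold and a redundancy threshold, which the abstract does not specify. The function names (`two_stage_filter`, `binned_mutual_info`, `pearson`) are hypothetical. A simple binned mutual information score stands in for the Maximal Information Coefficient, which needs a dedicated implementation, and the genetic-algorithm tuning step is left out.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column has no linear correlation
    return cov / math.sqrt(vx * vy)


def binned_mutual_info(x, labels, bins=4):
    """Stand-in relevance score (NOT MIC): mutual information, in nats,
    between an equal-width binning of x and the class labels."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant columns
    binned = [min(int((v - lo) / width), bins - 1) for v in x]
    n = len(x)
    px, py = Counter(binned), Counter(labels)
    pxy = Counter(zip(binned, labels))
    return sum(
        (c / n) * math.log(c * n / (px[b] * py[l]))
        for (b, l), c in pxy.items()
    )


def two_stage_filter(columns, labels, relevance_min, redundancy_max):
    """Return the sorted indices of the features that survive both stages.

    columns        -- list of feature columns (each a list of numbers)
    relevance_min  -- assumed hyperparameter 1: minimum feature-class score
    redundancy_max -- assumed hyperparameter 2: maximum |Pearson r| allowed
                      between a candidate and any already kept feature
    """
    scores = [binned_mutual_info(col, labels) for col in columns]

    # Stage 1: keep features that are relevant enough to the class attribute.
    candidates = [i for i, s in enumerate(scores) if s >= relevance_min]

    # Stage 2: visit candidates from most to least relevant and drop any
    # that is too strongly correlated with a feature already kept.
    candidates.sort(key=lambda i: -scores[i])
    kept = []
    for i in candidates:
        if all(abs(pearson(columns[i], columns[j])) <= redundancy_max
               for j in kept):
            kept.append(i)
    return sorted(kept)


# Toy data: f1 is a rescaled copy of f0, f2 is unrelated to the labels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = [0, 0, 0, 0, 2, 2, 2, 2]
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
selected = two_stage_filter([f0, f1, f2], labels,
                            relevance_min=0.1, redundancy_max=0.9)
```

In a setup like this, a genetic algorithm would search over `relevance_min` and `redundancy_max`, scoring each pair by the accuracy of a classifier trained on the surviving features.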
Keywords/Search Tags: Feature Selection, Maximal Information Coefficient, Heuristic Optimization, Whale Optimization Algorithm, Simulated Annealing
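The Simulated Annealing refinement step mentioned in part (2) of the abstract might look something like the sketch below. It covers only the local search: the Whale Optimization Algorithm that would supply the starting solution is omitted. The names `mrmr_score` and `anneal`, the difference form of the Max-Relevance and Min-Redundancy score, the precomputed `relevance` and `redundancy` inputs, and the cooling schedule are all assumptions made for illustration, not details taken from the thesis.

```python
import math
import random


def mrmr_score(mask, relevance, redundancy):
    """mRMR-style score (difference form): mean relevance of the selected
    features minus their mean pairwise redundancy."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    if not chosen:
        return float("-inf")  # an empty subset is never acceptable
    rel = sum(relevance[i] for i in chosen) / len(chosen)
    pairs = [(i, j) for i in chosen for j in chosen if i < j]
    red = sum(redundancy[i][j] for i, j in pairs) / len(pairs) if pairs else 0.0
    return rel - red


def anneal(mask, relevance, redundancy, rng, t0=1.0, cooling=0.9, steps=200):
    """Refine a 0/1 feature mask by simulated annealing.

    Each step flips one randomly chosen bit. Improvements are always
    accepted; worse moves are accepted with probability exp(delta / temp).
    The best mask seen so far is returned with its score.
    """
    current = list(mask)
    cur_score = mrmr_score(current, relevance, redundancy)
    best, best_score = list(current), cur_score
    temp = t0
    for _ in range(steps):
        neighbour = list(current)
        k = rng.randrange(len(neighbour))
        neighbour[k] = 1 - neighbour[k]  # flip one feature in or out
        score = mrmr_score(neighbour, relevance, redundancy)
        delta = score - cur_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_score = neighbour, score
            if cur_score > best_score:
                best, best_score = list(current), cur_score
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


# Hypothetical precomputed scores for four features.
relevance = [0.9, 0.8, 0.1, 0.85]
redundancy = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.1, 0.3],
    [0.1, 0.1, 0.0, 0.1],
    [0.2, 0.3, 0.1, 0.0],
]
start = [1, 1, 1, 1]  # in the hybrid scheme this would come from the global search
best_mask, best_score = anneal(start, relevance, redundancy, random.Random(0))
```

Passing in a seeded `random.Random` keeps a run reproducible. In a hybrid loop, `anneal` would be called once per iteration on the best mask found by the global search.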