
Feature Selection Model Based On Rough Set And Improved Cuckoo Search Algorithm And Its Application In Cancer Detection

Posted on: 2023-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: C X Zhang
GTID: 2544306920989359
Subject: Engineering
Abstract/Summary:
In the era of big data, data sets are often large in scale, span many categories, and combine high dimensionality with small sample sizes, so the feature space contains a great deal of redundant and irrelevant information. This information slows the learning process, increases the computational burden, and exposes the classifier to overfitting. Screening feature subsets is therefore an indispensable part of data processing. Feature selection is a data processing method that selects, according to a specific rule, a subset of features with good classification ability from a given feature set. However, when processing large amounts of data, feature selection suffers from poor subset optimization and low classification accuracy. For this reason, combining swarm intelligence algorithms with rough set theory to build feature selection models has become a research hotspot in the field. To address the deficiencies of the Cuckoo Search (CS) algorithm, this study constructs a new feature selection model that draws on the properties of rough sets and applies it to cancer detection. The research covers the following three aspects.

Firstly, the standard CS algorithm has low convergence accuracy, lacks information exchange within the population, and has poor local search ability. To address this, the study proposes an improved CS algorithm, WCSDE, based on a nonlinear inertia weight and the Differential Evolution (DE) algorithm, which improves the standard CS algorithm through two strategies. On the one hand, WCSDE adopts an inertia weight factor that decreases nonlinearly with the number of iterations, together with an optimized nest position update, to improve the balance between exploration and exploitation and to strengthen local optimization. On the other hand, the mutation, crossover, and selection mechanisms of the DE algorithm are introduced to compensate for the lack of information exchange within the population, avoid the loss of useful information, and improve convergence accuracy. In the experiments, four CS variants, including the standard CS and WCSDE, were compared on 13 classical benchmark functions in a function optimization task, and the effectiveness of the proposed algorithm was verified against two evaluation criteria. The results and statistical analysis show that the algorithm has better local search ability and stronger robustness.

Secondly, to address the low classification accuracy of feature selection on large amounts of data, a feature selection model based on rough set theory and an improved binary cuckoo search algorithm is proposed. First, to strengthen the search ability of the cuckoo algorithm, the mutation, crossover, and selection ideas of DE are integrated. Next, a new nest update mechanism searches for high-quality features to improve the feature selection result. Finally, a suitable fitness function is constructed based on rough set theory. To test the effectiveness of the proposed algorithm, experiments were conducted with three different classifiers on 16 data sets from the UCI repository, and the Friedman test and the Nemenyi post-hoc test were used to assess the results. The results show that the average classification accuracy of the proposed algorithm is 88%, and that it has an advantage in feature selection over the other algorithms compared.

Finally, building on the advantages of the proposed IBCSRS model, it is applied to cancer detection and compared with the standard Binary Cuckoo Search (BCS) algorithm. A detailed analysis of the experimental results shows that the proposed algorithm reaches an accuracy of 91% on the lung cancer data set, whereas the BCS algorithm reaches only 83%, which indicates that the proposed algorithm is more accurate in cancer detection.
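The abstract does not state the fitness function itself. As an illustration only, the following sketch shows one common way a rough-set-based fitness can score a binary feature mask: it combines the rough set dependency degree (the fraction of objects whose equivalence class under the selected features maps to a single decision label) with a reward for selecting fewer features. The function names, the weight `alpha`, and the toy data are assumptions made for this example and are not taken from the thesis.

```python
from collections import defaultdict

def dependency_degree(rows, labels, mask):
    """Rough set dependency of the decision on the selected features:
    the share of objects lying in label-pure equivalence classes."""
    selected = [i for i, bit in enumerate(mask) if bit]
    labels_seen = defaultdict(set)  # equivalence class key -> labels observed
    class_size = defaultdict(int)   # equivalence class key -> object count
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in selected)
        labels_seen[key].add(label)
        class_size[key] += 1
    # Positive region: objects whose class contains exactly one label.
    positive = sum(class_size[k] for k, seen in labels_seen.items() if len(seen) == 1)
    return positive / len(rows)

def fitness(rows, labels, mask, alpha=0.9):
    """Assumed weighted fitness: alpha * dependency + (1 - alpha) * reduction.
    alpha = 0.9 is an illustrative choice, not a value from the thesis."""
    reduction = 1 - sum(mask) / len(mask)
    return alpha * dependency_degree(rows, labels, mask) + (1 - alpha) * reduction

# Toy discrete data set: the label is fully determined by feature 0.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]

for mask in ([1, 0], [0, 1], [1, 1]):
    print(mask, round(fitness(rows, labels, mask), 3))
```

In a binary cuckoo search, each nest would be such a mask, and a search step would keep the candidate mask with the higher fitness. Under this assumed weighting, a mask that preserves full dependency with fewer features scores higher than the full feature set.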
Keywords/Search Tags: Feature Selection, Cuckoo Search Algorithm, Differential Evolution Algorithm, Nonlinear Inertial Weight, Rough Set, Cancer Detection