Research On The Cancer Microarray Data Feature Selection Method Based On Krill Herd Algorithm

Posted on: 2021-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: J C Hou
GTID: 2404330605454309
Subject: Engineering
Abstract/Summary:
A large amount of valuable microarray data has accumulated rapidly as gene chip technology has become widely used in cancer research. Mining these data for cancer markers is a research focus in bioinformatics. Microarray data greatly facilitate the diagnosis, classification, pathogenesis research and rapid drug development of cancer at the molecular level, so they have considerable research significance and application value for the early diagnosis and treatment of cancer. However, these data are "high-dimensional, small-sample" data in which most genes are noisy or redundant, and medical experts cannot analyze them directly in a short time. If such data are modeled and processed directly by data analysis algorithms, the many noisy and irrelevant features greatly reduce the performance of the algorithm, increase the computational complexity, and cause the "curse of dimensionality". Feature selection (FS) is among the most effective ways to address these problems: it reduces the dimensionality of the data and has attracted growing attention in biomedicine. Wrapper-based feature selection uses the classification performance of a candidate feature subset to find the best subset, and it has received wide attention because of its higher classification accuracy and flexibility. The search algorithm is the most important part of a wrapper method and has a great impact on its performance. Population-based meta-heuristic algorithms are usually used as the search algorithm. The Krill Herd algorithm is an efficient search algorithm proposed in recent years that has been widely used in economic load dispatch, neural network training and network optimization. This thesis improves the Krill Herd algorithm and applies it to feature selection for cancer microarray data. The main research results are as follows:

(1) The Binary Krill Herd (BKH) algorithm has insufficient ability to search for feature subsets in microarray data feature selection, and it easily falls into local optima and converges prematurely. To address this, we propose an improved Binary Krill Herd algorithm, named IGMBKH. First, the IGMBKH algorithm selects some top-ranking features by Information Gain (IG) value to guide the construction of the initial solutions and so obtain a better initial population. Then, in the search phase of the MBKH algorithm, we introduce the Chaos Memory Weight Factor into the operators of MBKH to balance exploration and exploitation and further enhance the search ability of BKH. Finally, to avoid local optima and premature convergence, the MBKH algorithm introduces the Hyperbolic Tangent Function and the Adaptive Transfer Factor. Experimental results on six cancer microarray datasets show that, compared with BKH, MBKH, and other classic and state-of-the-art feature selection algorithms, the proposed IGMBKH algorithm achieves higher classification accuracy with fewer features. In feature selection for cancer microarray data, the IGMBKH algorithm searches more deeply and has a stronger search ability. It can therefore serve as a preprocessing tool that effectively reduces the dimensionality of high-dimensional microarray data and better mines the features of cancer data.

(2) The IGMBKH algorithm is highly competitive in finding feature subsets, but it converges relatively slowly, whereas the Binary Black Hole Algorithm (BBHA) is one of the few algorithms with fast convergence. Given the complementarity of the two algorithms, this thesis proposes a hybrid krill herd and black hole algorithm, IGMBKH-BBHA, which adaptively divides the population into two sub-populations. The IGMBKH-BBHA algorithm uses adaptive partition rules to control the number of individuals of each kind in the population, dynamically adjusting the balance between the MBKH algorithm and the BBHA so that the advantages of the two algorithms complement each other. Experimental results show that the IGMBKH-BBHA algorithm improves convergence speed and classification performance compared with the krill herd algorithm alone and the black hole algorithm alone. We also tested and discussed how three different filter methods for initializing the MBKH-BBHA population affect the performance of the algorithm. Finally, the performance of the hybrid algorithm in screening biomarkers is further verified by a genetic analysis of the selected features. The method has reference value for biomarker screening and can provide new and valuable information for researchers studying the relationship between cancer and genes.
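The abstract names two ingredients of IGMBKH without giving formulas: an initial population guided by Information Gain, and a hyperbolic tangent transfer function for turning continuous krill positions into binary feature masks. The sketch below is only an illustration of those two ideas under stated assumptions, not the thesis's implementation. The median split used to compute IG, the selection probabilities `p_top` and `p_rest`, the rule "select a feature with probability |tanh(v)|", and all function names are hypothetical; the Chaos Memory Weight Factor and the Adaptive Transfer Factor are not modeled because the abstract does not define them.

```python
import math
import random


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(column, labels):
    """IG of one feature after a median split (the split rule is an assumption)."""
    threshold = sorted(column)[len(column) // 2]
    low = [y for x, y in zip(column, labels) if x < threshold]
    high = [y for x, y in zip(column, labels) if x >= threshold]
    n = len(labels)
    # Weighted entropy of the two parts; an empty part contributes nothing.
    conditional = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def ig_guided_population(X, y, pop_size, top_k, p_top=0.8, p_rest=0.05, seed=0):
    """Build binary feature masks biased toward the top_k features by IG.

    p_top and p_rest are hypothetical probabilities of switching on a
    top-ranked feature and any other feature, respectively.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    gains = [information_gain([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    top = set(ranked[:top_k])
    population = [
        [1 if rng.random() < (p_top if j in top else p_rest) else 0
         for j in range(n_features)]
        for _ in range(pop_size)
    ]
    return population, gains


def tanh_transfer(value):
    """V-shaped transfer function: map a continuous value into [0, 1]."""
    return abs(math.tanh(value))


def binarize(position, rng):
    """Select feature j with probability |tanh(position[j])| (assumed rule)."""
    return [1 if rng.random() < tanh_transfer(v) else 0 for v in position]
```

In a wrapper setting, each mask produced this way would be scored by training a classifier on the selected columns; that fitness step is omitted here because the abstract does not say which classifier or fitness function the thesis uses.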
Keywords/Search Tags: Feature Selection, Krill Herd Algorithm, Information Gain, Black Hole Algorithm, Adaptive Partition Rule