
The Research Of Feature Selection Algorithms Based On Joint Symmetrical Uncertainty

Posted on: 2018-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Hao
GTID: 2348330536960944
Subject: Computer application technology
Abstract/Summary:
We live in a rapidly changing era in which new technologies are continually invented and new products developed. High-throughput genomics, proteomics and metabolomics technologies are widely used in cancer research. They are used to obtain biological data for building better predictive models of diagnosis, prognosis and therapy, for identifying and characterizing key signals, and for finding new targets for drug development. These technologies present investigators with the task of extracting meaningful information from high-dimensional data, in which each sample is defined by thousands or tens of thousands of measurements, usually obtained concurrently. Data mining methods have been adopted to extract useful information from these vast amounts of data. Feature selection, one of the main data analysis methods in data mining, is widely applied to biological data processing.

Fast Correlation-Based Filter (FCBF) is an effective feature selection approach. It applies Symmetrical Uncertainty (SU) and the Approximate Markov Blanket (AMB) to remove irrelevant features and redundant features, respectively. However, FCBF prefers features that are highly correlated with the class label (C-correlation) and uses the AMB to remove other features that are highly redundant with a preferred feature. Because features with strong C-correlation do not necessarily help build a good learning model, the JSU-FCBF method is proposed. JSU-FCBF uses the AMB to cluster features and adopts Joint Symmetrical Uncertainty (JSU) to pick, from each feature cluster, the feature that has strong joint discriminative ability with the already selected features. Experimental results on eight public datasets indicate the efficiency and effectiveness of JSU-FCBF.

Genetic Algorithm (GA) is a metaheuristic inspired by the process of natural selection and belongs to the larger class of evolutionary algorithms. Its operations include selection, crossover and mutation. Based on an analysis of the relevance of feature pairs to the class label, a new feature selection method (JSU-GA) based on Joint Symmetrical Uncertainty and Genetic Algorithm is put forward. In JSU-GA, a pair of features that has strong associativity and high joint symmetrical uncertainty with the class label is bound together and treated as a single unit in the selection, crossover and mutation operations. Eight datasets are used to demonstrate the superiority of JSU-GA. Experimental results show that, by binding such feature pairs, JSU-GA can select a feature subset with higher classification ability and fewer features.
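The abstract names Symmetrical Uncertainty, the Approximate Markov blanket and Joint Symmetrical Uncertainty without defining them. The sketch below is illustrative only and is not the thesis implementation. It assumes discrete (already discretized) features, uses the standard definition SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), and uses the approximate Markov blanket condition from the original FCBF formulation. The JSU definition here, the symmetrical uncertainty between a feature pair treated as one joint variable and the class label, is an assumption, since the abstract does not give the formula. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete, hashable values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def joint_symmetrical_uncertainty(f1, f2, c):
    """ASSUMED definition: SU between the joint variable (f1, f2) and class c."""
    return symmetrical_uncertainty(list(zip(f1, f2)), c)


def is_approx_markov_blanket(fj, fi, c):
    """FCBF condition: fj covers fi if SU(fj, c) >= SU(fi, c)
    and SU(fi, fj) >= SU(fi, c), in which case fi is treated as redundant."""
    su_ic = symmetrical_uncertainty(fi, c)
    return (symmetrical_uncertainty(fj, c) >= su_ic
            and symmetrical_uncertainty(fi, fj) >= su_ic)


# Toy XOR data: neither feature alone tells us anything about the class,
# but the pair determines it completely.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [0, 1, 1, 0]

su_single = symmetrical_uncertainty(f1, label)
jsu_pair = joint_symmetrical_uncertainty(f1, f2, label)
```

On the XOR toy data, each feature alone has zero symmetrical uncertainty with the label, while the pair has a clearly positive joint score. This is the kind of case the abstract points to when it notes that ranking features only by C-correlation can miss features that are discriminative in combination.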
Keywords/Search Tags:Data Mining, Feature Selection, Joint Symmetrical Uncertainty, FCBF, Genetic Algorithm