| Tumors pose a serious threat to human health. They arise from abnormal gene expression, and gene expression profiles obtained with microarray technology can approximately reflect gene expression. However, most genes in gene expression data are unrelated to tumorigenesis. Based on gene expression profile data, this paper therefore studies gene selection methods built on fuzzy neighborhood rough sets, providing new methods for tumor prediction and diagnosis. The main research contents are as follows: (1) Existing methods pay little attention to the incomplete inclusion between a sample's neighborhood and the decision equivalence classes when classifying gene expression data, which results in low classification accuracy for tumor genes. To address this problem, a variable precision tumor gene selection method based on fuzzy neighborhood mutual information is proposed. First, a new entropy is defined to measure the uncertainty of the fuzzy similarity relation. Then, the concepts of fuzzy neighborhood joint entropy, fuzzy neighborhood conditional entropy and fuzzy neighborhood mutual information are proposed, and the basic properties of these uncertainty measures are studied. Next, internal and external attribute significance are defined for fuzzy neighborhood information systems and used to evaluate candidate features. Finally, a variable precision tumor gene selection algorithm based on fuzzy neighborhood mutual information is designed and applied to attribute reduction on gene datasets. Experimental results on six open-access gene datasets (DLBCL, SRBCT, Leukemia1, 9_Tumors, Leukemia and Brain_Tumor2) show that, compared with existing algorithms, the proposed algorithm can effectively remove noise and redundant genes, yielding smaller feature subsets with improved classification accuracy. (2) In fuzzy neighborhood rough sets, the positive region, and with it the classification accuracy, may decrease as the number of attributes increases. To address this problem, a non-monotonic tumor gene selection method based on fuzzy neighborhood mutual information is proposed. First, Fisher Score is used to preliminarily reduce the dimension of tumor gene expression profiles. Second, coverage and credibility are introduced into the fuzzy neighborhood information system, and fuzzy neighborhood coverage and fuzzy neighborhood credibility are proposed. Combined with the uncertainty measure of information entropy, fuzzy neighborhood decision entropy, fuzzy neighborhood conditional entropy, fuzzy neighborhood joint entropy and fuzzy neighborhood mutual information are proposed. Then, the properties of these uncertainty measures and the relationships among them are studied. Finally, a non-monotonic feature gene selection algorithm based on fuzzy neighborhood mutual information is designed, and its performance is validated on four public gene datasets: Colon, Leukemia, Brain_Tumor2 and Lung. Compared with existing algorithms, the proposed algorithm can greatly reduce the dimension of gene profiles. |
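
The abstract names Fisher Score as the preliminary dimensionality-reduction step but gives neither its formula nor its settings. As a minimal sketch, assuming the common textbook definition (for each gene, the weighted spread of class means around the overall mean divided by the weighted within-class variance), such a filter could look like the following. The function names `fisher_scores` and `top_k_genes`, the toy data and the cutoff `k` are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Return one Fisher score per gene (column of X).

    Assumed definition (textbook form, not confirmed by the paper):
        score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    X: list of samples, each a list of expression values.
    y: list of class labels, one per sample.
    """
    n_genes = len(X[0])
    # Group sample rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(n_genes):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # weighted spread of class means around the overall mean
        within = 0.0   # weighted variance inside each class
        for rows in by_class.values():
            values = [row[j] for row in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # Guard against division by zero when a gene is constant in every class.
        if within == 0.0:
            scores.append(float("inf") if between > 0.0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def top_k_genes(X, y, k):
    """Indices of the k genes with the highest Fisher score."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data (made up for illustration): 4 samples, 3 genes, two classes.
X = [[1.0, 5.0, 0.2],
     [1.2, 4.0, 0.9],
     [3.0, 5.5, 0.3],
     [3.2, 4.5, 0.8]]
y = [0, 0, 1, 1]
selected = top_k_genes(X, y, k=1)  # gene 0 separates the two classes best
```

In a pipeline like the one described, the genes kept by this filter would then be handed to the fuzzy neighborhood mutual information selection stage, which is not sketched here because the abstract does not define its measures.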