Research On Feature Gene Selection Method Based On Fuzzy Neighborhood Entropy

Posted on: 2024-06-05; Degree: Master; Type: Thesis
Country: China; Candidate: M J Sun; Full Text: PDF
GTID: 2530307067472994; Subject: Computer technology
Abstract/Summary:
Abnormal gene expression can lead to tumor formation, threatening human life and health. In biomedicine, gene expression data approximately reflect gene expression, but they contain a large number of genes that are redundant or unrelated to tumor pathogenesis. Based on gene expression data, this thesis therefore focuses on accurately characterizing feature genes and measuring their correlation with tumor classification, so as to select, from thousands of genes, the feature genes that carry the most information and are associated with tumor pathogenesis. The main research contents are as follows:

(1) A parameterized fuzzy similarity relation is introduced to granulate the continuous data in a fuzzy neighborhood information system, preserving as much of the original information in the continuous data as possible, and a fuzzy neighborhood joint entropy model is constructed. First, fuzzy neighborhood granules and fuzzy decisions of samples are constructed from the fuzzy similarity relation, the neighborhood radius, and the decision equivalence classes, in order to characterize gene expression data accurately. Then, based on the proposed model, the definition of fuzzy neighborhood joint entropy, its non-negativity, and other theorems are given, and a corresponding feature gene selection algorithm is designed. Finally, the significance of candidate feature genes is evaluated according to the non-negativity principle of the algorithm, so as to select a feature gene subset that carries the most information, best expresses the biological information of the original dataset, and achieves classification performance equal to or higher than that of the original feature set. The parameter settings are analyzed and discussed so that the method can tolerate noise in the data. Compared with existing related algorithms, this method performs better in both the number of selected feature genes and tumor classification accuracy, providing a new method for the prediction and diagnosis of tumors.

(2) During feature gene selection, different entropies affect the selection of feature genes in different ways. A new model, the fuzzy neighborhood cross entropy model, is therefore constructed on the basis of work (1). First, the fuzzy neighborhood granules and fuzzy decisions of samples are used to characterize gene expression data accurately, preserving the original information of the gene data to the greatest possible extent. Then, based on the proposed model, the definition of fuzzy neighborhood cross entropy is given, its non-negativity and other theorems are proved, and the selection of parameters for tolerating noise in the data is analyzed and discussed. Finally, a fuzzy neighborhood cross entropy algorithm is designed for the proposed model to select feature genes effectively. The experimental results show that this method effectively eliminates irrelevant and redundant features and improves tumor classification accuracy to a certain extent.
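The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea in work (1), not the thesis's actual model or algorithm. It assumes features scaled to [0, 1], a similarity of 1 - |a(x) - a(y)| combined across features with a minimum, memberships below 1 - radius cut to zero, crisp decision classes in place of fuzzy decisions, and a greedy forward search that stops when the conditional entropy no longer drops. The function names, the `radius` and `tol` parameters, and the demo data are all hypothetical.

```python
import numpy as np

def fuzzy_neighborhood(X, features, radius):
    """Return an n x n matrix whose row i is the fuzzy granule of sample i."""
    sub = X[:, features]
    # Assumed similarity: 1 - |a(x) - a(y)| per feature, combined with min.
    sim = (1.0 - np.abs(sub[:, None, :] - sub[None, :, :])).min(axis=2)
    # Assumed noise tolerance: memberships below 1 - radius are set to zero.
    sim[sim < 1.0 - radius] = 0.0
    return sim

def joint_entropy(X, y, features, radius):
    """Joint entropy of a feature subset and the decision (crisp classes assumed)."""
    granules = fuzzy_neighborhood(X, features, radius)
    same_class = (y[:, None] == y[None, :]).astype(float)
    # Fuzzy cardinality of each granule intersected with its decision class.
    card = (granules * same_class).sum(axis=1)
    # card lies in [1, n] because each sample fully belongs to its own granule,
    # so every log term is <= 0 and the entropy is non-negative.
    return float(-np.mean(np.log2(card / len(y))))

def conditional_entropy(X, y, features, radius):
    """Uncertainty about the decision that remains given the feature subset."""
    granules = fuzzy_neighborhood(X, features, radius)
    same_class = (y[:, None] == y[None, :]).astype(float)
    ratio = (granules * same_class).sum(axis=1) / granules.sum(axis=1)
    return float(-np.mean(np.log2(ratio)))

def select_features(X, y, radius=0.3, tol=1e-3):
    """Greedy forward selection (an assumed search strategy)."""
    remaining = list(range(X.shape[1]))
    selected, best = [], np.inf
    while remaining:
        scores = [conditional_entropy(X, y, selected + [f], radius)
                  for f in remaining]
        k = int(np.argmin(scores))
        if best - scores[k] <= tol:  # no meaningful reduction: stop
            break
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected

# Hypothetical toy data: gene 0 separates the two classes, gene 1 is noise.
X_demo = np.array([[0.05, 0.5], [0.10, 0.9], [0.15, 0.1],
                   [0.90, 0.6], [0.95, 0.2], [0.85, 0.8]])
y_demo = np.array([0, 0, 0, 1, 1, 1])
print(select_features(X_demo, y_demo))
```

A larger `radius` admits more distant samples into each granule, which makes the measure more tolerant of noise but also coarser; this is the kind of trade-off the parameter analysis in the abstract refers to.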
Keywords/Search Tags: Fuzzy similarity relation, Fuzzy neighborhood joint entropy, Fuzzy neighborhood cross entropy, Feature gene selection, Uncertainty measure