
Gene Microarray Data Classification Based On Tolerance Rough Sets

Posted on: 2014-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: P Wang
Full Text: PDF
GTID: 2248330398950716
Subject: Computer software and theory
Abstract/Summary:
Rough set theory is an effective tool for analyzing imprecise, inconsistent, and incomplete data. However, the classical rough set model, which is based on an equivalence relation, applies only to discretized data, a serious limitation. A rough set model that represents knowledge through a tolerance relation can handle numerical data directly. High-throughput gene microarray data are high-dimensional with small sample sizes, and a sound classification method for them is urgently needed. Applying rough set theory to the classification of gene microarray data has become an active topic in bioinformatics.

This thesis constructs rough set models from two perspectives, the definition of approximation and the way neighborhoods are obtained, and proposes an attribute reduction algorithm based on a forward deletion policy to solve the gene selection problem. It defines approximation on sets and obtains each object's neighborhood by intersection, in place of the traditional point-based approximation and distance-based neighborhood. This makes the approximation of a given concept more accurate and yields high similarity across dimensions. Results on eight data sets show that different rough set models should be chosen to construct knowledge for different data sets, and that the model with set approximation and intersection neighborhood suits most gene microarray data. Finally, the validity of the method is further illustrated by reference to existing gene annotations and by statistical comparison.

Object classification is achieved through a rule-based classifier. First, the data sets obtained after gene selection are discretized using the frequency interval method. Then the concept of a knowledge granule is defined, and each granule's center set and neighborhood set are built by merging adjacent intervals, which departs from earlier rule extraction schemes that relied on only the center set or only the neighborhood set. Finally, rules are extracted and the classifier is built using the rule induction algorithm proposed in this thesis. Experimental results show that the rule induction algorithm based on dual verification not only extracts a rule set with a low error rate but also constructs a classifier with higher accuracy.

Overall, this thesis successfully addresses the classification of gene microarray data using the tolerance rough set model and related algorithms. Experiments on animal, plant, and simulated data show that the method performs better in both the quality of the selected genes and the classification ability of the classifier.
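
To make the idea of a tolerance relation on numerical data concrete, the following is a minimal Python sketch of textbook-style tolerance rough set approximations. It is not the thesis's algorithm: the function names, the per-attribute threshold `eps`, the toy expression values, and the choice to form a neighborhood as the intersection of per-attribute tolerance classes are all assumptions made for illustration.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def tolerance_neighborhood(data: Sequence[Sequence[float]], i: int, eps: float) -> Set[int]:
    """Indices of samples within eps of sample i on EVERY attribute.

    Requiring all attributes to agree is the same as intersecting the
    per-attribute tolerance classes (an assumed, illustrative definition).
    """
    return {
        j
        for j, row in enumerate(data)
        if all(abs(a - b) <= eps for a, b in zip(data[i], row))
    }


def approximations(
    data: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
    target: Hashable,
    eps: float,
) -> Tuple[Set[int], Set[int]]:
    """Lower and upper approximation of the concept 'label == target'."""
    concept = {i for i, y in enumerate(labels) if y == target}
    lower: Set[int] = set()
    upper: Set[int] = set()
    for i in range(len(data)):
        nbh = tolerance_neighborhood(data, i, eps)
        if nbh <= concept:   # neighborhood lies entirely inside the concept
            lower.add(i)
        if nbh & concept:    # neighborhood overlaps the concept
            upper.add(i)
    return lower, upper


# Hypothetical toy data: six samples, two gene expression values each.
data: List[List[float]] = [
    [1.0, 2.0], [1.5, 2.5], [8.0, 9.0], [8.5, 8.0], [4.0, 5.0], [6.0, 7.0],
]
labels = ["A", "A", "B", "B", "A", "B"]

lower, upper = approximations(data, labels, "A", eps=2.5)
print("lower:", sorted(lower))
print("upper:", sorted(upper))
print("boundary:", sorted(upper - lower))
```

Samples in the lower approximation certainly belong to the concept, while samples in the boundary (upper minus lower) cannot be decided from the chosen attributes at this tolerance. Unlike an equivalence relation, the tolerance relation is reflexive and symmetric but not transitive, so neighborhoods may overlap and no prior discretization is required.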
Keywords/Search Tags: Rough Set, Classification, Gene Microarray, Gene Selection