Discriminatory Gene Selection For Cancer Diagnosis Of Multi-class Situation With SVM

Posted on: 2006-03-09 | Degree: Master | Type: Thesis
Country: China | Candidate: X Ji
GTID: 2144360152471662 | Subject: Computer application technology
Abstract/Summary:
With the development of biotechnology, DNA microarray data have made gene-based diagnosis and gene therapy possible. Gene selection is the foundation of gene-based diagnosis. A microarray dataset typically contains thousands of genes but only a small number of samples, so selecting the diagnostic genes in a way that keeps the diagnosis valid and reliable is a challenging problem.

This thesis studies SVM-based methods for selecting diagnostic genes for multi-class diseases. It first discusses how to express the classification contribution of a gene in the two-class case, and then presents four gene selection methods for multi-class diseases: (1) a method based on sum contribution, which takes the sum of a gene's classification contributions over every pair of diseases as that gene's total contribution to classifying the multi-class diseases; (2) a method based on class patterns, which measures a gene's contribution by the margin between two class centers; (3) a method based on correlation and contribution space; (4) a method based on correlation and sum contribution. All four methods select genes using a one-versus-one multi-class SVM. Methods (1) and (2) place no restriction on the correlation between the selected genes, whereas methods (3) and (4) restrict the Pearson linear correlation between any pair of selected genes.

Extensive experiments with these methods were conducted on real gene microarray datasets. For the dataset with 2308 genes and 4 types of diseases, the best selected gene subset contained 7 genes; for the dataset with 4026 genes and 3 types of diseases, the best selected gene subset also contained 7 genes. The selected diagnostic gene subsets were small and showed good diagnostic ability. The results indicate that the methods proposed in this thesis are effective.
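As a rough illustration of the sum-contribution idea and the Pearson correlation restriction, the sketch below scores each gene on every pair of classes, adds the pairwise scores, and then picks genes greedily in rank order. The abstract does not give the SVM-derived two-class contribution formula, so `pair_contribution` is a hypothetical stand-in (gap between class centers divided by the pooled spread), not the thesis's measure. The function names, the `max_abs_corr` parameter, and the toy data are likewise assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def pair_contribution(values_a, values_b):
    # ASSUMPTION: stand-in two-class score, not the thesis's SVM-based one.
    # Gap between the two class centers divided by the pooled spread.
    ma, mb = mean(values_a), mean(values_b)
    spread = sqrt(sum((v - ma) ** 2 for v in values_a)
                  + sum((v - mb) ** 2 for v in values_b))
    return abs(ma - mb) / (spread + 1e-12)


def sum_contribution(X, y):
    # Total score of each gene: its pairwise scores summed over all class pairs
    # (the one-versus-one decomposition).
    classes = sorted(set(y))
    totals = [0.0] * len(X[0])
    for a, b in combinations(classes, 2):
        rows_a = [row for row, label in zip(X, y) if label == a]
        rows_b = [row for row, label in zip(X, y) if label == b]
        for g in range(len(totals)):
            totals[g] += pair_contribution([r[g] for r in rows_a],
                                           [r[g] for r in rows_b])
    return totals


def pearson(u, v):
    # Pearson linear correlation coefficient of two equal-length sequences.
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def select_genes(X, y, k, max_abs_corr=None):
    # Rank genes by total score; if max_abs_corr is given, skip any gene whose
    # |Pearson r| with an already chosen gene exceeds that threshold.
    totals = sum_contribution(X, y)
    ranked = sorted(range(len(totals)), key=lambda g: totals[g], reverse=True)
    chosen = []
    for g in ranked:
        column = [row[g] for row in X]
        if max_abs_corr is not None and any(
            abs(pearson(column, [row[c] for row in X])) > max_abs_corr
            for c in chosen
        ):
            continue
        chosen.append(g)
        if len(chosen) == k:
            break
    return chosen


# Toy data (made up): 6 samples, 4 genes, 3 classes.
# Gene 1 is gene 0 scaled by 2, so the two are perfectly correlated.
X = [
    [1.0, 2.0, 5.0, 3.0],
    [1.2, 2.4, 5.1, 1.0],
    [3.0, 6.0, 5.2, 2.0],
    [3.1, 6.2, 4.9, 3.5],
    [5.0, 10.0, 9.0, 1.5],
    [5.2, 10.4, 9.2, 2.5],
]
y = [0, 0, 1, 1, 2, 2]

unrestricted = select_genes(X, y, k=2)
restricted = select_genes(X, y, k=2, max_abs_corr=0.95)
print(unrestricted, restricted)
```

Without the restriction, the two redundant genes (0 and 1) take both slots. With the restriction, the second slot goes to gene 2, which carries different information. In a faithful implementation, `pair_contribution` would be replaced by a contribution derived from a two-class SVM trained on each disease pair.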
Keywords/Search Tags:Gene Selection, SVM, Cross Validation, Sum Contribution, Contribution Space, Correlation