Researches On Gene Selection Algorithm With Support Vector Machine

Posted on: 2011-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: W You
Full Text: PDF
GTID: 2120360308469306
Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of gene microarray technology, researchers can measure the expression levels of thousands of genes at once. In cancer research, gene microarray data open a new route to disease diagnosis, cancer therapy and cancer prediction. However, raw microarray data contain thousands of genes but only a small number of samples, which makes the data difficult to analyze and process. It is therefore necessary to select a discriminative gene subset from the original data in order to improve the accuracy of cancer diagnosis.

Traditional statistical methods run into limitations in gene selection. The Support Vector Machine (SVM), which is based on statistical learning theory, handles small-sample problems well through the structural risk minimization principle, and it can handle nonlinear problems by employing a kernel function. Because of these advantages, SVM shows greater adaptability and better performance than other gene selection algorithms.

This thesis studies gene selection based on SVM. The main contributions are as follows:

1. The production, characteristics and applications of gene microarray data are introduced. The theory of SVM is briefly analyzed, and the SVM-RFE (recursive feature elimination) gene selection algorithm is investigated in detail.

2. Sequential forward selection is introduced into the SVM-RFE algorithm. Recursive feature elimination and sequential forward selection are carried out at the same time, on groups of genes. The method speeds up the SVM-RFE algorithm and achieves better performance.

3. An adaptive strategy for selecting the SVM kernel parameter is studied. The proposed algorithm sets the initial parameter value from the 2-norm distance between the samples, and then updates this value automatically as the dataset changes during recursive feature elimination.

4. A hybrid Multi-SVMs model is proposed. First, multiple SVMs with different parameter values are used to perform gene selection. Second, the gene subsets resulting from the first step are merged. Finally, SVM-RFE is applied to the merged subset to obtain the best gene subset. The algorithm avoids the problem of choosing a single parameter value and achieves higher classification accuracy by using a group of parameter values instead of only one.

Experimental results on three public datasets demonstrate that the proposed algorithms achieve better performance.
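The recursive elimination loop and the distance-based kernel width mentioned in the abstract can be illustrated with a small sketch. It is not the thesis's implementation. The abstract does not give the exact formula for the kernel parameter, so the mean pairwise 2-norm distance is used here as an assumed stand-in. The function names (`train_linear_svm`, `mean_pairwise_distance`, `svm_rfe`), the toy data, and the plain subgradient-descent trainer are all hypothetical. The trainer is linear and does not itself use the width, which is recomputed at each step only to show how it follows the shrinking gene set.

```python
import math


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by full-batch subgradient descent on the
    L2-regularised hinge loss. Labels must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:          # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mean_pairwise_distance(X):
    """Mean 2-norm distance over all sample pairs (assumed heuristic
    for an RBF kernel width; the thesis's formula may differ)."""
    total, pairs = 0.0, 0
    for i in range(len(X)):
        for k in range(i + 1, len(X)):
            total += math.sqrt(sum((a - c) ** 2 for a, c in zip(X[i], X[k])))
            pairs += 1
    return total / pairs


def svm_rfe(X, y, n_keep):
    """Remove one gene per round: the one with the smallest squared
    weight in a freshly trained linear SVM. Also records the kernel
    width recomputed on the surviving genes before each round."""
    remaining = list(range(len(X[0])))
    widths = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        widths.append(mean_pairwise_distance(X_sub))
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda j: w[j] ** 2)
        del remaining[worst]
    return remaining, widths


# Toy data (made up): gene 0 separates the classes, genes 1 and 2 are noise.
X = [
    [2.0, 0.1, -0.2],
    [1.8, -0.1, 0.1],
    [2.2, 0.0, 0.2],
    [-2.0, 0.1, 0.1],
    [-1.9, -0.1, -0.2],
    [-2.1, 0.0, 0.0],
]
y = [1, 1, 1, -1, -1, -1]

kept, widths = svm_rfe(X, y, n_keep=1)
print("kept genes:", kept)
print("kernel width per round:", widths)
```

Because dropping a gene can only shorten the distances between samples, a width fixed at the start would become too large relative to the data as elimination proceeds. This is one plausible motivation for updating the parameter after each round. Real microarray data would need one SVM fit per round over thousands of genes, which is why eliminating genes in groups matters for speed.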
Keywords/Search Tags: Gene Microarray, Gene Selection, Support Vector Machine, Sequential Forward Selection, Kernel Methods