Data Analysis Of Cancer Gene Expression Based On SVM-RFE Algorithm

Posted on: 2013-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: S Liang
Full Text: PDF
GTID: 2248330371485168
Subject: Software engineering
Abstract/Summary:
With the continuous development of science and technology, modern medicine has advanced considerably. Although more and more previously unknown areas are becoming clear, some remain difficult to explore in depth, cancer research among them. Cancer research has progressed substantially alongside research on the human genome. Faced with complex data, however, we need more intelligent methods to find the useful information in massive databases, and this has become the bottleneck of cancer research. Under these circumstances, the emergence of gene chips, the development of feature selection methods, and in-depth study of support vector machines, statistical theory, and data mining technology together provide strong support for modern cancer research.

Support vector machine (SVM) is a machine learning method with good generalization ability that can handle high-dimensional, small-sample, and nonlinear problems. It is a further development of statistical learning theory. The core idea of SVM is to make the learning machine adapt better to the training samples. Unlike earlier methods, SVM follows the principle of structural risk minimization instead of empirical risk minimization, which gives it better generalization ability. In addition, SVM adopts the idea of the kernel function: it handles nonlinear problems in practice by mapping the actual problem into a high-dimensional space through a nonlinear transformation and reconstructing the function in that higher-dimensional space.

In this paper, we focus on the analysis of cancer gene expression using the SVM-RFE algorithm. SVM-RFE combines the support vector machine (SVM) with recursive feature elimination. It is a backward search method that reduces the dimensionality of the feature space by eliminating unnecessary features. We apply the SVM-RFE algorithm to tumor gene expression data to examine its advantages and characteristics in handling massive medical data.

Specifically, we apply the SVM-RFE algorithm to the analysis of gastric cancer gene data. First, we preprocess the data with the t-test to remove irrelevant information and extract the genetic information; then we apply the SVM-RFE algorithm to the resulting gene data. The experimental results show that a support vector machine with SVM-RFE feature selection achieves higher prediction accuracy and better sensitivity. The selected feature genes can also serve as a reference for research on the disease.
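The two-stage procedure described in the abstract (a t-test filter followed by SVM-RFE) might be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation: the toy data, the function names, the cut-off of 10 genes after filtering, and the final count of 3 genes are all assumptions. A simple Pegasos-style subgradient solver without a bias term stands in for a real SVM solver, and features are ranked by the squared weight of a linear SVM, the criterion commonly used with SVM-RFE, which the abstract itself does not spell out.

```python
import math
import random

def t_scores(X, y):
    """Absolute Welch t statistic per feature for classes labelled +1 / -1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 1]
        b = [row[j] for row, label in zip(X, y) if label == -1]
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
        vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
        scores.append(abs(ma - mb) / math.sqrt(va / len(a) + vb / len(b) + 1e-12))
    return scores

def train_linear_svm(X, y, lam=0.01, epochs=100, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss.
    Assumption: no bias term, which is adequate for the balanced toy data below."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]      # regularisation shrinkage
            if margin < 1.0:                              # sample violates the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def svm_rfe(X, y, n_keep):
    """Backward search: retrain, then drop the feature with the smallest w_j^2."""
    active = list(range(len(X[0])))
    eliminated = []
    while len(active) > n_keep:
        sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(active.pop(worst))
    return active, eliminated

def make_toy_data(n_per_class=20, n_features=30, n_informative=3, seed=0):
    """Hypothetical stand-in for expression data: the first few 'genes' differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += 2.0 * label
            X.append(row)
            y.append(label)
    return X, y

X, y = make_toy_data()

# Stage 1: t-test filter, keeping the 10 genes with the largest |t| (assumed cut-off).
scores = t_scores(X, y)
filtered = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:10]
X_filtered = [[row[j] for j in filtered] for row in X]

# Stage 2: SVM-RFE on the filtered genes; indices are local to `filtered`.
kept_local, dropped_local = svm_rfe(X_filtered, y, n_keep=3)
selected = sorted(filtered[k] for k in kept_local)

print("after t-test filter:", sorted(filtered))
print("after SVM-RFE:", selected)
```

Removing one feature per round follows the backward-search description most directly; on real expression data with thousands of genes, removing a fraction of the remaining features per round is a common way to keep the number of retraining steps manageable.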
Keywords/Search Tags: Support vector machine, Kernel function, Cancer gene expression data, Recursive feature elimination, Gene selection