Tumor Gene Identification Study On Support Vector Machine Classification Model

Posted on: 2014-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: A L Hao
Full Text: PDF
GTID: 2248330395997186
Subject: Software engineering
Abstract/Summary:
The support vector machine (SVM), introduced in the late 20th century, is a method for solving classification and regression problems in data mining. In little more than a decade, both its theory and its applications have advanced rapidly.

This thesis studies the theory of the SVM model and its application to practical problems in the biomedical field, such as the classification and identification of tumor or cancer cells. Applying a classifier directly to raw gene expression profiling data does not give satisfactory results, whichever classifier is used. Identifying tumor or cancer cells therefore involves two important steps: gene selection and classification. Before a classifier is applied, the raw data must be preprocessed, a step known as feature gene selection. Feature gene selection is usually divided into two stages: removing irrelevant genes and eliminating redundant genes.

Feature gene selection affects the final classification performance. In this work, the information index to classification is used to remove irrelevant genes. Many methods exist for removing redundant genes, and they may produce different results on different tumor samples. This thesis adopts a new algorithm for removing redundant genes, called redundancy elimination by correlation coefficient. After feature gene selection, a classifier is applied to the sample set, mainly to distinguish tumor samples from normal samples. A good classifier identifies tumor or cancer samples more accurately and provides a stronger theoretical basis for future clinical research.

Tumor gene expression profiles have few samples and high dimensionality, so SVM is used as the final classifier. Compared with other classifiers, the main advantage of SVM is that it performs well on data sets with very few samples and high dimensionality, which is why it has attracted so much attention from researchers in recent years. This thesis proposes an improved rough SVM model for classifying tumor samples and normal samples. The final comparative experiments show that both the redundancy elimination algorithm and the improved classifier model are effective.
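The two-stage feature gene selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption: the relevance score uses a signal-to-noise form, |mean1 - mean0| / (sd1 + sd0), as a stand-in for the information index to classification; redundancy is judged by the absolute Pearson correlation against already-kept genes; and the names `relevance_score`, `select_genes`, `top_k`, and `max_corr`, as well as the toy data, are hypothetical. The SVM classification step itself is not shown.

```python
from math import sqrt
from statistics import mean, pstdev


def relevance_score(values, labels):
    """Assumed signal-to-noise score for one gene: |mu1 - mu0| / (sd1 + sd0)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pstdev(pos) + pstdev(neg)
    return abs(mean(pos) - mean(neg)) / spread if spread > 0 else 0.0


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def select_genes(expr, labels, top_k=3, max_corr=0.9):
    """Stage 1: keep the top_k most relevant genes.
    Stage 2: walk them best-first and drop any gene whose absolute
    correlation with an already-kept gene reaches max_corr."""
    ranked = sorted(
        expr, key=lambda g: relevance_score(expr[g], labels), reverse=True
    )[:top_k]
    kept = []
    for gene in ranked:
        if all(abs(pearson(expr[gene], expr[k])) < max_corr for k in kept):
            kept.append(gene)
    return kept


# Toy data (hypothetical): 1 = tumor sample, 0 = normal sample.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "g1": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],  # strongly discriminative
    "g2": [5.1, 5.0, 4.7, 1.3, 1.0, 0.8],  # near-copy of g1, so redundant
    "g3": [0.5, 3.0, 1.5, 2.9, 0.6, 1.6],  # uninformative, so irrelevant
    "g4": [2.0, 4.0, 3.0, 1.0, 2.5, 0.5],  # weaker but distinct signal
}
selected = select_genes(expr, labels, top_k=3, max_corr=0.9)
print(selected)  # ['g1', 'g4']
```

In this sketch the expression values of the selected genes would then form the reduced feature vectors passed to the classifier. The thresholds are illustrative only and would need tuning on real expression profiles.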
Keywords/Search Tags: Support Vector Machines, Gene expression profile, Tumor identification, Improved Rough Support Vector Machine