Study On SVMs-based Classification Of Gene Expression Data

Posted on:2007-07-15

Degree:Master

Type:Thesis

Country:China

Candidate:C Zhan

Full Text:PDF

GTID:2178360182480913

Subject:Computer application technology

Abstract/Summary:

Gene microarray technology, a recent advance in molecular biology, has had a far-reaching influence on bioinformatics and provides an important research method for the field. Microarrays make it feasible to obtain large volumes of gene expression data. These data capture gene expression patterns under given experimental conditions and support deeper study of the underlying biological processes.

Support vector machines (SVMs) are a machine learning method based on statistical learning theory. SVMs address small-sample problems by replacing empirical risk minimization (ERM) with structural risk minimization (SRM). Nonlinear problems are turned into linear ones by mapping the low-dimensional original space into a high-dimensional feature space and employing a kernel function, which keeps the algorithm easy to implement. Because of these advantages, SVMs have become a focus of machine learning research and have been applied successfully in many areas.

Gene microarray expression data are high-dimensional, have few samples, and are nonlinear, which poses a new challenge for some traditional machine learning methods; the analysis of such data has become a research focus in bioinformatics. Trained on gene expression data, SVM classifiers provide an effective way to analyze such data. This thesis focuses on SVM classification algorithms for gene expression data and proposes improvements that address existing problems in those algorithms and models. It improves classification based on gene expression data in two aspects: feature selection and the SVM classification algorithm.

Gene expression data sets typically have "few samples, high dimensionality". To address this problem, the thesis improves classification accuracy through feature selection. We propose a new recursive feature elimination (RFE) method, correlativity-based RFE. By calculating the correlativity between genes, the method searches for minimum redundancy while avoiding the deletion of the genes that most strongly dominate the target phenotypes. The new feature selection approach achieves higher classification accuracy, and the feature selection process takes less time.

Based on an analysis of the traditional algorithm, we also make improvements to the sequential minimal optimization (SMO) algorithm to improve classification accuracy and training speed. The algorithm uses a radial basis function (RBF) kernel and optimizes SVM classification performance by adjusting parameters. Experimental results show that the new algorithm achieves higher classification accuracy than the traditional algorithm.
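
The abstract does not give the details of correlativity-based RFE, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes that gene relevance is measured by the absolute Pearson correlation with the class label (a stand-in for the SVM weight ranking that SVM-RFE would recompute each round), and that a gene counts as redundant when its correlation with a more relevant retained gene reaches a threshold. The names `correlation_aware_rfe` and `redundancy_threshold` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_aware_rfe(X, y, n_keep, redundancy_threshold=0.9):
    """Backward elimination that removes redundant genes first.

    X: list of samples, each a list of expression values (one per gene).
    y: class labels encoded as numbers, e.g. -1 and +1.
    Returns the sorted indices of the n_keep retained genes.
    """
    columns = [list(col) for col in zip(*X)]
    remaining = list(range(len(columns)))
    # Assumption: a fixed univariate relevance score replaces the SVM
    # weights that a real SVM-RFE would recompute after every removal.
    relevance = {j: abs(pearson(columns[j], y)) for j in remaining}

    while len(remaining) > n_keep:
        ranked = sorted(remaining, key=lambda j: relevance[j], reverse=True)
        victim = None
        # Walk from the least relevant gene upward and drop the first one
        # that is strongly correlated with a more relevant retained gene.
        # The top-ranked gene is never considered redundant.
        for pos in range(len(ranked) - 1, 0, -1):
            j = ranked[pos]
            if any(abs(pearson(columns[j], columns[k])) >= redundancy_threshold
                   for k in ranked[:pos]):
                victim = j
                break
        # No redundant gene left: fall back to dropping the least relevant one.
        if victim is None:
            victim = ranked[-1]
        remaining.remove(victim)
    return sorted(remaining)


# Toy data (made up for illustration): gene 1 is nearly a copy of gene 0,
# gene 2 is weakly informative, gene 3 is constant.
X = [
    [1.0, 2.0, 0.3, 5.0],
    [2.0, 4.1, 0.9, 5.0],
    [3.0, 6.0, 0.1, 5.0],
    [4.0, 8.2, 0.7, 5.0],
]
y = [-1, -1, 1, 1]
print(correlation_aware_rfe(X, y, n_keep=2))  # -> [0, 2]
```

On this toy input the near-duplicate gene 1 is eliminated before the weaker but non-redundant gene 2, which is the kind of trade-off between redundancy and relevance that the abstract describes. A fuller implementation would retrain a linear SVM after each elimination and rank genes by their weights.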

Keywords/Search Tags:

Bioinformatics, Gene Expression Data, Statistical Learning Theory, Support Vector Machine, Feature Selection

Related items

1	Support Vector Machine And Its Application In Gene Expression Data
2	Tumor Classification Based On Gene Expression Studies
3	SVM Based Research On Feature Selection Method For Gene Expression Data
4	Study On Least Squares Support Vector Machine And Its Applications
5	Data Analysis Of Cancer Gene Expression Based On SVM-RFE Algorithm
6	Mining Method Based On Gene Expression Profiling Data
7	Gene Selection And Cancer Classification Based On Optimization Algorithm And Support Vector Machine
8	Research On Feature Selection Algorithm Base On Gene Expression Data
9	The Application Research Of Support Vector Machine In Non-spherical Distribution Data Set And Tumor Gene
10	Support Vector Machine With Input Uncertainty And Its Application To Bioinformatics