Algorithm Of Gene Expression Data Classification And Its Application

Posted on:2006-10-07

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Shen

GTID:2168360155961993

Subject:Computer application technology

Abstract/Summary:

With the development of bioinformatics, analyzing complex genomic data with machine learning approaches has become an important research field. Gene expression data produced by microarray technologies capture gene expression patterns under given conditions, and they support deeper research into fundamental biological processes such as gene function, tumors, aging, and drugs. This thesis mainly discusses methods for tumor classification and gene function classification using gene expression data, and proposes some improvements to these algorithms and methods.

This thesis improves tumor classification from gene expression data in two respects: the classification algorithm and feature selection. We combine SVM with kNN, based on viewing SVM as a 1-NN classifier in which only one representative point is selected for each class. For a test sample, the algorithm computes the distance from the sample to the optimal separating hyperplane of the SVM in feature space and chooses the classification algorithm according to that distance. Experimental results show that the new algorithm achieves higher classification accuracy than the original algorithms. Gene expression data sets typically have "few samples, high dimensionality". To address this problem, this thesis improves classification accuracy through feature selection. We propose a new recursive feature elimination method, Correlativity-based RFE. By calculating the correlativity between genes, the new method seeks minimum redundancy while avoiding the deletion of the genes that most strongly determine the target phenotypes. Experimental results show that the new feature selection approach achieves higher classification accuracy and that the feature selection process takes less time.

For gene function classification using gene expression data, this thesis presents two algorithms based on the subordination relationship among function classes: a confidence adjustment algorithm based on the gene function tree (tCAA) and a dominant factor decision algorithm based on the gene function tree (tDA). Building on these two algorithms, the thesis proposes a new gene function classification algorithm based on the gene function tree. In the test phase, the algorithm automatically detects gene function confidences that are too high or have been ignored, and then adjusts them according to tCAA. The new algorithm introduces tDA to avoid the limitation of fixed-size prediction. It employs the dominant factor to decide the...
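
The SVM and kNN combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the switching rule, the threshold, or the kernel, so the following are all assumptions made for illustration: a linear hyperplane `(w, b)` that is already trained, distance measured in input space rather than a kernel feature space, a hypothetical threshold `eps`, SVM used when the sample is far from the hyperplane, and kNN over the training samples used when it is close. The function names are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def hyperplane_distance(w, b, x):
    """Signed distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training samples nearest to x."""
    nearest = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


def hybrid_predict(w, b, train_X, train_y, x, eps=0.5, k=3):
    """Choose the classifier by distance to the hyperplane.

    Assumption: samples at least eps away from the hyperplane are
    labeled by the sign of the SVM decision value; closer samples
    are handed to kNN. The thesis may use a different rule.
    """
    d = hyperplane_distance(w, b, x)
    if abs(d) >= eps:
        return 1 if d > 0 else -1
    return knn_predict(train_X, train_y, x, k)


if __name__ == "__main__":
    # Toy 2-D data; the hyperplane x0 = 0 is a hypothetical, hand-picked separator.
    w, b = (1.0, 0.0), 0.0
    train_X = [(-1.0, 0.0), (-0.2, 0.1), (-0.1, -0.1), (1.0, 0.0), (2.0, 0.0)]
    train_y = [-1, -1, -1, 1, 1]
    print(hybrid_predict(w, b, train_X, train_y, (3.0, 0.0)))  # far from the hyperplane: SVM decides
    print(hybrid_predict(w, b, train_X, train_y, (0.1, 0.0)))  # near the hyperplane: kNN decides
```

In this toy data, the second query lies on the positive side of the hyperplane but is surrounded by negative training samples, so the kNN branch overrides the label the hyperplane alone would assign. That is the kind of near-boundary case a distance-based switch is meant to handle.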

Keywords/Search Tags:

Gene expression data, Tumor classification, Feature selection, Function classification, Function tree

Related items

1	Selected Based On The Gene Expression Profiles Of Tumor Characteristic Gene Studies
2	The Research On Gene Expression Profiles Data For Tumor Classification
3	A Study Of Tumor Classification Algorithms Using Gene Expression Data
4	Tumor Classification Based On Gene Expression Studies
5	The Application Research Of Support Vector Machine In Non-spherical Distribution Data Set And Tumor Gene
6	The Application Of Feature Selection In Gene Expression Data Analysis
7	Study On Feature Selection Method For Classification Of Gene Expression Data
8	The Study Of Tumor Classification Methods Based On Gene Expression Data
9	Data Analysis Of Cancer Gene Expression Based On SVM-RFE Algorithm
10	The Research On Gene Expression Profile Data Mining Method Based On Sparse Representation