Research On Tumor Classification Based On Gene Expression Profile Data

Posted on:2017-10-17

Degree:Master

Type:Thesis

Country:China

Candidate:M Shu

Full Text:PDF

GTID:2404330488979872

Subject:Software engineering

Abstract/Summary:

Cancer is one of the major diseases threatening human life, and the number of people dying from it has been rising in recent years. Its mortality rate is far higher than that of most other diseases, so the treatment of cancer, especially of malignant tumors, has become a focus of research. Tumors generally have a variety of similar subtypes, and classifying and diagnosing them correctly is essential for targeted treatment. Gene chip technology emerged with the completion of the Human Genome Project and the arrival of the post-genomic era. It can measure the expression levels of thousands of genes at the same time, and the expression values of these genes make up gene expression profile data. However, due to various subjective and objective factors, gene expression profile data usually contain a large amount of noise, redundancy, irrelevant features, and outliers. The data are also nonlinear and high-dimensional, with few samples, which makes processing and analysis a major challenge. This research therefore focuses on how to select the relevant information from gene expression profile data and classify tumors accurately. The main work of this paper consists of the following two points.

First, this paper proposes a feature selection method based on ReliefF and SVM-RFE, which reduces the dimension of a tumor gene expression profile data set in two steps. The first step uses the ReliefF algorithm to compute a weight for each feature; features whose weights fall below an adaptive threshold are then eliminated, which achieves the initial dimension reduction. ReliefF can only eliminate irrelevant features and cannot reduce redundant ones, so SVM-RFE is introduced in the second step. The second step uses the SVM-RFE algorithm to rank the features, eliminates one or more of them at a time, and finds the optimal subset through multiple iterations. Combining the two methods yields a new feature selection method that removes noisy, redundant, and irrelevant features from the data set, so as to find the features associated with tumor classification, improve classification accuracy, and reduce the workload.

Second, because traditional classification methods have low accuracy and are prone to overfitting on high-dimensional, small-sample data, this paper proposes a classification method based on improved sparse representation. The method consists of three stages. The first and second stages are the two feature selection steps described above, and the reduced data set they produce is the input to the third stage. In the third stage, a sparse representation classifier is used to solve for the sparse coefficients, and the reconstruction error is then computed from those coefficients. Finally, each sample is classified according to its reconstruction error. Comparative experiments show that the method achieves better performance on some data sets.
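The first and third stages described in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions and is not the thesis's implementation: it uses a simplified two-class Relief (one nearest hit and one nearest miss per sample) rather than full ReliefF, it takes the mean weight as the threshold because the abstract does not define its adaptive threshold, it solves the sparse coding problem with a basic iterative soft-thresholding loop, it runs on synthetic data, and it leaves out the SVM-RFE stage. All function and variable names are hypothetical.

```python
import numpy as np


def relief_weights(X, y):
    """Simplified Relief: one nearest hit and one nearest miss per sample."""
    n, d = X.shape
    # Scale every feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    Xs = (X - lo) / span
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf                        # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(same, np.inf, dist))
        # Reward features that differ on the miss and agree on the hit.
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n


def ista_lasso(D, x, lam=0.01, n_iter=500):
    """Minimise 0.5*||x - D a||^2 + lam*||a||_1 by iterative soft thresholding."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a


def src_predict(X_train, y_train, x, lam=0.01):
    """Assign x to the class whose training samples reconstruct it best."""
    # Dictionary columns are the unit-norm training samples.
    D = X_train.T / np.linalg.norm(X_train, axis=1)
    x = x / np.linalg.norm(x)
    a = ista_lasso(D, x, lam)
    classes = np.unique(y_train)
    # Reconstruction error using only the coefficients of each class in turn.
    residuals = [np.linalg.norm(x - D @ np.where(y_train == c, a, 0.0))
                 for c in classes]
    return classes[int(np.argmin(residuals))]


# Synthetic stand-in for expression data: 60 samples, 50 "genes",
# of which only the first six carry class information (an assumption).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 50))
X[y == 0, 0:3] += 5.0
X[y == 1, 3:6] += 5.0

test = np.arange(60) % 5 == 0
X_tr, y_tr, X_te, y_te = X[~test], y[~test], X[test], y[test]

# Stage 1: weight the features and drop those at or below the threshold.
w = relief_weights(X_tr, y_tr)
selected = np.flatnonzero(w > w.mean())  # assumed threshold: mean weight

# Stage 3: classify each test sample by reconstruction error.
preds = np.array([src_predict(X_tr[:, selected], y_tr, x)
                  for x in X_te[:, selected]])
accuracy = float(np.mean(preds == y_te))
print("kept features:", selected.tolist(), "accuracy:", accuracy)
```

In a fuller pipeline, a recursive elimination step driven by a linear SVM would sit between the two stages shown here, pruning redundant features from `selected` before the sparse representation classifier is applied.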

Keywords/Search Tags:

Gene expression profile, Feature selection, Tumor classification, ReliefF algorithm, SVM-RFE algorithm, Sparse representation algorithm

Related items

1	Research On Tumor Classification Algorithm Based On Sparse Representation
2	Research On Classification Algorithm Of Tumor Gene Expression Profile Based On Dictionary Learning
3	The Analysis Of Tumor Gene Expression Profile Data Based On Hybrid Feature Selection Algorithm
4	Research On Classification Algorithm Based On Tumor Gene Expression Profile Data
5	Research On High-dimensional Biomedical Feature Selection Algorithm Based On Intelligent Algorithm
6	Research On The Algorithm Of Gene Feature Selection Based On Classification Technology
7	Research On Feature Selection Algorithm Based On Breast Cancer Gene Expression Data
8	The Research On Gene Selection Based Shrinkage Feature Selection Algorithm For Cancer Classification
9	Research On Tumor Classification Algorithm Based On Gene Expression Data
10	Study Of Gene Selection Algorithm For Multi-category Tumor Classification