Research On Tumor Classification Algorithm Based On Sparse Representation

Posted on: 2017-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: X N Hong
Full Text: PDF
GTID: 2404330488971868
Subject: Software engineering
Abstract/Summary:
Cancer remains one of the deadliest diseases. Thanks to the rapid development of gene chip technology, a large amount of cancer gene expression data has been collected and used for research. Using gene expression data to diagnose disease has become a hot topic of the post-genome era, and more accurate classification of gene expression data helps improve the efficiency of diagnosis. However, gene expression data generally have small sample sizes, high dimensionality, and non-linear structure. In view of these characteristics, this thesis applies feature selection and classification methods based on sparse representation to gene expression data.

Section 3 proposes a new feature selection method based on sparse representation that reduces dimensionality and detects redundancy. In the first step, the correlations between genes and categories are computed by sparse representation, and the genes are ranked by these correlations to make a preliminary feature selection. The selected feature subset is then divided into groups, and redundancy is detected iteratively within each group. After redundancy detection in each group, a classification prediction is performed to ensure the classification ability of the resulting subset. The process continues until all groups have been processed, which yields the final subset. This method removes redundant features to the greatest extent possible while ensuring that the classification ability of the final subset follows a rising trend.

Because sparse representation algorithms usually have high time complexity, Section 4 designs a new classification algorithm based on metasamples and dictionary pair learning, called MDPLC. The method has two steps. First, the metasamples of each class are extracted by singular value decomposition (SVD). Then the dictionary is decomposed into a synthesis dictionary and an analysis dictionary, and an alternating iterative method is proposed to solve for the optimal coefficients. The algorithm solves for the coding coefficients quickly while maintaining classification accuracy, and it also shows excellent generalization ability and stability. On public datasets, the proposed algorithm achieves better classification performance than several other sparse-representation-based classification methods proposed in the literature.
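The first MDPLC step, extracting per-class metasamples by SVD, can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation. The function names `extract_metasamples` and `predict`, the parameter `k`, and the choice of the leading left singular vectors as metasamples are all assumptions made for illustration. The coding step is also swapped for a simpler technique: a plain orthogonal projection onto each class's metasamples stands in for dictionary pair learning, which the abstract does not specify in enough detail to reproduce.

```python
import numpy as np


def extract_metasamples(X, y, k):
    """Return {class label: (genes x k) matrix of metasamples}.

    X is a genes x samples expression matrix and y holds one label per
    sample. Assumption: the metasamples of a class are taken to be the
    leading k left singular vectors of that class's sub-matrix.
    """
    metasamples = {}
    for label in np.unique(y):
        X_class = X[:, y == label]
        # Thin SVD: the columns of U are orthonormal directions in gene space.
        U, _, _ = np.linalg.svd(X_class, full_matrices=False)
        metasamples[label.item()] = U[:, :k]
    return metasamples


def predict(metasamples, x):
    """Assign x to the class whose metasamples reconstruct it best.

    Stand-in for the coding step: an orthogonal projection replaces the
    dictionary pair learning used by MDPLC.
    """
    residuals = {}
    for label, U in metasamples.items():
        coefficients = U.T @ x  # valid because the columns of U are orthonormal
        residuals[label] = np.linalg.norm(x - U @ coefficients)
    return min(residuals, key=residuals.get)
```

Calling `extract_metasamples(X, y, k)` on a training matrix and then `predict(metasamples, x)` on a test sample returns the label with the smallest reconstruction residual. Keeping only `k` metasamples per class is what makes the per-sample coding step cheap compared with coding against every training sample.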
Keywords/Search Tags: Gene expression data, feature selection, redundancy detection, sparse representation, MDPLC, metasample