Study On Data Classification Method Of Manifold Learning And Multi-task Learning Based On Gene Expression Of Tumor

Posted on: 2016-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: B B Tian
Full Text: PDF
GTID: 2298330467991307
Subject: Software engineering
Abstract/Summary:
With the development of bioinformatics, feature extraction methods for tumor gene expression data have been widely used in tumor classification and prediction, and they play an important role in clinical diagnosis and treatment. However, tumor gene expression data are high-dimensional and have small sample sizes, among other characteristics, so traditional machine learning algorithms have difficulty classifying them accurately. This thesis studies tumor classification methods for gene expression data from the perspectives of manifold learning and multi-task learning.

Locally Linear Representation Fisher Criterion (LLRFC) is applied to the classification of tumor gene expression data. To address the drawback that locally linear embedding does not use sample category information, LLRFC constructs a within-class graph and a between-class graph using the K-nearest-neighbor criterion and the data labels. The weights of the within-class and between-class graphs are computed by locally linear reconstruction. A Fisher criterion based on the divergence of the within-class and between-class scatter graphs is then established to extract the best features for classification in the feature subspace. Classification experiments on different gene expression data sets show the effect of the LLRFC algorithm.

The Multi-task Sparse (Mt-SP) method is also applied to the classification of tumor gene expression data. It combines the characteristics of sparse projection and multi-task learning. On the one hand, it uses the relationships among the tasks when learning the main task, which yields better classification results and stronger generalization ability than single-task learning. On the other hand, it uses sparse selection to identify the more informative features of tumor gene expression data. The algorithm is applied to the insurance company benchmark data set (COIL2000) and the small round blue cell tumor data set (SRBCTs) to test its effect.
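
The LLRFC description above (label-aware K-nearest-neighbor graphs, locally linear reconstruction weights, then a Fisher-style criterion) can be illustrated with a minimal sketch. This is not the thesis's implementation. The function names `reconstruction_weights`, `graph_weights`, and `llrfc_like_projection` are hypothetical, and so are the parameter defaults. The regularization terms are added here only for numerical stability. The sketch also assumes the usual Fisher direction, in which between-class scatter is maximized relative to within-class scatter; the abstract does not spell this out.

```python
import numpy as np

def reconstruction_weights(X, i, neighbors, reg=1e-3):
    """LLE-style weights that reconstruct X[i] from X[neighbors]; they sum to 1."""
    Z = X[neighbors] - X[i]                 # shift so that X[i] is the origin
    G = Z @ Z.T                             # local Gram matrix, shape (k, k)
    G = G + reg * (np.trace(G) + 1e-12) * np.eye(len(neighbors))  # assumed regularizer
    w = np.linalg.solve(G, np.ones(len(neighbors)))
    return w / w.sum()

def graph_weights(X, y, k, same_class):
    """Weight matrix of the within-class (same_class=True) or between-class graph."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]
    W = np.zeros((n, n))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    for i in range(n):
        mask = (y == y[i]) if same_class else (y != y[i])
        mask[i] = False                     # a sample is never its own neighbor
        candidates = np.flatnonzero(mask)
        if candidates.size == 0:
            continue
        nearest = candidates[np.argsort(d2[i, candidates])[:k]]
        W[i, nearest] = reconstruction_weights(X, i, nearest)
    return W

def llrfc_like_projection(X, y, k=3, dim=2, ridge=1e-6):
    """Projection matrix (n_features, dim) from a Fisher-style ratio of graph scatters."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    I = np.eye(n)
    Ww = graph_weights(X, y, k, same_class=True)
    Wb = graph_weights(X, y, k, same_class=False)
    Sw = X.T @ (I - Ww).T @ (I - Ww) @ X    # within-class reconstruction scatter
    Sb = X.T @ (I - Wb).T @ (I - Wb) @ X    # between-class reconstruction scatter
    Sw = Sw + ridge * (np.trace(Sw) / d) * np.eye(d)  # keep Sw positive definite
    # Solve the generalized eigenproblem Sb p = lambda Sw p by whitening with Sw.
    L = np.linalg.cholesky(Sw)
    L_inv = np.linalg.inv(L)
    C = L_inv @ Sb @ L_inv.T
    _, V = np.linalg.eigh(C)                # eigenvalues in ascending order
    top = V[:, ::-1][:, :dim]               # eigenvectors of the largest eigenvalues
    return L_inv.T @ top

# Toy usage on synthetic data (not gene expression data).
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0.0, 1.0, (15, 8)), rng.normal(3.0, 1.0, (15, 8))])
y_toy = np.array([0] * 15 + [1] * 15)
P_toy = llrfc_like_projection(X_toy, y_toy, k=3, dim=2)
embedded = X_toy @ P_toy                    # low-dimensional features for a classifier
```

The projected features `embedded` would then be passed to any standard classifier. For real gene expression data, where the number of genes far exceeds the number of samples, a preliminary dimensionality reduction step would likely be needed before forming the scatter matrices.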
Keywords/Search Tags: tumor gene expression data, sparse projection, multi-task learning, manifold learning