
The Study Of Tumor Classification Methods Based On Gene Expression Data

Posted on: 2013-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: P Yu
Full Text: PDF
GTID: 2248330362973766
Subject: Instrument Science and Technology
Abstract/Summary:
As living standards have risen rapidly and lifestyles and dietary habits have changed, cancer incidence and mortality have risen sharply, and cancer has become a major threat to human health. Early diagnosis and treatment of tumors is the key to reducing cancer mortality.

Clinical examination methods for tumors are either invasive or noninvasive. In both cases the diagnosis, which is a classification, must be made from the test results. However, tumor classification currently depends heavily on pathologists' subjective judgment of the tumor tissue. When diagnosis relies on observation alone, differences in experience, fatigue, negligence, and other human factors are hard to avoid, and misdiagnosis often occurs. A tumor classification system based on gene expression data can avoid misjudgment caused by subjective human factors, because it bases its assessment entirely on objective data. Building such a system with fast computation and high accuracy is therefore the main difficulty we currently face.

Gene expression data have few samples and very high dimensionality, and they contain complex noise introduced by human factors and environmental changes. Classifying them directly produces too large an error, so their dimensionality must first be reduced with an effective dimensionality reduction algorithm. This thesis analyzes traditional dimensionality reduction algorithms (such as PCA, LDA, LPP, and NPE), introduces subspace graph-embedding dimensionality reduction, and places these algorithms and their extensions in the graph embedding framework. These methods, however, must decompose dense matrices, which makes computing time and physical memory cost rise rapidly, while the correct classification rate does not improve.
To overcome these shortcomings, this thesis uses the Spectral Regression analysis algorithm to reduce the dimensionality of gene expression data. Classification is the ultimate goal of dimensionality reduction. After experimenting with several classification algorithms and comparing and analyzing their results, the thesis proposes Kernel Space K-Nearest Neighbor, built on K-Nearest Neighbor. Drawing on the advantages of Support Vector Machines and combining them with Spectral Regression analysis, it then proposes a Kernel Space K-Nearest Neighbor-Support Vector Machines classification system for gene expression data. For example, after reducing the dimensionality of the 4_Tumors dataset with Spectral Regression and classifying with K-Nearest Neighbor and Kernel Space K-Nearest Neighbor, using 12 training samples per class, the recognition rates were 88.98% and 91.01%, respectively. The system saves memory cost and computing time while providing decision support for clinical diagnosis and treatment.
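
The abstract does not spell out how Kernel Space K-Nearest Neighbor measures distance, so the following is only a minimal sketch of one common reading: neighbors are ranked by distance in the feature space induced by a kernel, using d^2(x, y) = K(x, x) - 2K(x, y) + K(y, y). The polynomial kernel, its parameters, the function names, and the toy data are all assumptions for illustration and are not taken from the thesis. A polynomial kernel is used here because a Gaussian kernel would rank neighbors exactly as plain Euclidean K-Nearest Neighbor does.

```python
import numpy as np

def poly_kernel(a, b, degree=2, coef0=1.0):
    """Polynomial kernel matrix K[i, j] = (a_i . b_j + coef0) ** degree (assumed choice)."""
    return (a @ b.T + coef0) ** degree

def kernel_sq_dists(test_x, train_x, kernel=poly_kernel):
    """Squared distances between test and training rows in the kernel feature space."""
    k_cross = kernel(test_x, train_x)
    # Self-similarities K(x, x), computed one row at a time.
    k_test = np.array([kernel(x[None, :], x[None, :])[0, 0] for x in test_x])
    k_train = np.array([kernel(y[None, :], y[None, :])[0, 0] for y in train_x])
    d2 = k_test[:, None] - 2.0 * k_cross + k_train[None, :]
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding

def kernel_knn_predict(train_x, train_y, test_x, k=3, kernel=poly_kernel):
    """Majority vote among the k nearest training samples in kernel space."""
    d2 = kernel_sq_dists(test_x, train_x, kernel)
    nearest = np.argsort(d2, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(train_y[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy stand-in for dimension-reduced samples (hypothetical, two classes).
train_x = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
train_y = np.array([0, 0, 0, 1, 1, 1])
test_x = np.array([[0.05, 0.05], [2.9, 3.0]])

print(kernel_knn_predict(train_x, train_y, test_x, k=3))  # prints [0 1]
```

In a pipeline like the one described above, `train_x` and `test_x` would be the low-dimensional embeddings produced by the dimensionality reduction step rather than raw expression values.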
Keywords/Search Tags: Gene Expression Data, Classification Methods, Spectral Regression Analysis, Kernel Space K-Nearest Neighbor, Kernel Space K-Nearest Neighbor-Support Vector Machines