Clustering Analysis Of Cancer Gene Expression Data Based On Manifold Learning

Posted on: 2014-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: C L Wang
GTID: 2268330392464444
Subject: Computer software and theory
Abstract/Summary:
Cancer gene expression data are high-dimensional and have small sample sizes, so the data must be analyzed and processed before useful information can be extracted from them. Many dimensionality reduction and clustering methods have been applied to cancer gene expression data, and the information they uncover is used in the treatment and early diagnosis of disease. Based on the characteristics of cancer gene expression data, this thesis proposes a clustering analysis method based on manifold learning. The method combines manifold learning with clustering analysis, makes the dimensionality-reduced cancer gene expression data visualizable, and obtains good clustering results.

Firstly, this thesis introduces the basic theory of manifold learning and gene clustering. Several representative manifold learning algorithms are analyzed, and their basic principles, steps, advantages, and disadvantages are described. The thesis then introduces the application of clustering algorithms to the analysis of gene expression data and lists several common clustering algorithms for such data.

Secondly, an improved locally linear embedding (LLE) algorithm with a modified distance and multiple weights is introduced. Because cancer gene expression data are unevenly distributed, the algorithm replaces the Euclidean distance that LLE uses to find nearest neighbors with a new distance. It also constructs the local linear structure from several linearly independent sets of weights, which yields better embedding results.

Thirdly, a clustering analysis method for cancer gene expression data based on manifold learning is proposed. By analyzing the manifold distribution characteristics of the data, the method combines manifold learning with clustering: it determines the intrinsic dimensionality, achieves visualization, and then clusters the data according to their low-dimensional structure.

Finally, the proposed method is applied to two cancer data sets. The experiments were simulated in Matlab, and the results are evaluated and analyzed.
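
To make the two LLE ingredients named above concrete (nearest-neighbor search and reconstruction weights), the following is a minimal sketch of the standard LLE weight step, not the improved algorithm of the thesis. The abstract does not define the new distance or the multiple-weights scheme, so the sketch uses plain Euclidean distance and a single weight set; the function names, the `dist` parameter, and the regularization constant `reg` are assumptions made for illustration only.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance (standard LLE; the thesis replaces this)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(points, i, k, dist=euclidean):
    """Indices of the k points closest to points[i], excluding i itself.

    `dist` is a hypothetical hook: a different distance could be passed in.
    """
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return others[:k]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def reconstruction_weights(points, i, neighbors, reg=1e-3):
    """Weights (summing to 1) that best rebuild points[i] from its neighbors.

    Solves G w = 1 on the local Gram matrix G, then normalizes w.
    `reg` is an assumed regularization factor that keeps G invertible.
    """
    diffs = [[p - q for p, q in zip(points[j], points[i])] for j in neighbors]
    k = len(neighbors)
    G = [[sum(a * b for a, b in zip(diffs[r], diffs[c])) for c in range(k)]
         for r in range(k)]
    trace = sum(G[r][r] for r in range(k))
    eps = reg * trace if trace > 0 else reg
    for r in range(k):
        G[r][r] += eps
    w = solve(G, [1.0] * k)
    total = sum(w)
    return [v / total for v in w]

# Toy data: the point (1, 0) lies midway between (0, 0) and (2, 0).
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 5.0)]
nbrs = nearest_neighbors(pts, 1, 2)
weights = reconstruction_weights(pts, 1, nbrs)
```

On this toy data the two neighbors are symmetric about the point being reconstructed, so they receive approximately equal weights. In full LLE these weights would then be held fixed while low-dimensional coordinates are computed, and a clustering method would be run on those coordinates.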
Keywords/Search Tags: Cancer gene expression data, Gene clustering, Manifold learning, Locally linear embedding (LLE), Reconstruction weights