Algorithms For Clustering Gene Expression Data Based On Spanning Tree

Posted on: 2007-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Zhang
GTID: 2120360185977526
Subject: Applied Mathematics
Abstract/Summary:
The invention of GeneChips allows us to study simultaneous variations of genes at the genome-wide scale. Cluster analysis is the preferred method for analyzing gene expression data. Cluster analysis is the art of finding groups in a given data set such that objects in the same group are similar to each other while objects in different groups are dissimilar. With the explosion of gene expression data, the question of how to apply analysis techniques from computer science to these data and discover useful, instructive knowledge for biological experiments is attracting more and more attention from bioinformaticians.

Through extensive analysis and research, we find that two minimum-spanning-tree-based algorithms for clustering gene expression data put forward by Dong Xu and Victor Olman, the long-edge-cutting algorithm and the "best representative point" globally optimal clustering algorithm, can be improved to shorten their runtime. Building on them, we put forward a direct clustering algorithm, a locally optimal clustering algorithm and a maximum spanning tree fuzzy clustering algorithm. These new algorithms mainly rely on direct sorting and recursion. They simplify the computational procedure, improve program efficiency and shorten runtime. The algorithms are discussed and evaluated on the basis of experimental results. It is shown that optimal clusters can be obtained and that our algorithms can reach linear runtime. This paper also introduces MST-Cluster, a software system developed for clustering gene expression data; the system classifies the input gene data according to a specified algorithm and can also compare two classified groups of genes.

In summary, this paper studies algorithms for clustering gene expression data based on the minimum spanning tree and puts forward two improved clustering algorithms.
The first chapter gives an overview of the current state of research in this area. The second chapter introduces the related definitions and formulae. In the third chapter, we study algorithms for clustering gene expression data based on the minimum spanning tree, improve them, and give simulation results obtained on a gene database. In the fourth chapter, we develop MST-Cluster, a software system designed for clustering gene expression data. The fifth chapter gives a summary and outlines future work.
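
For orientation, the long-edge-cutting idea named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes each gene is an expression profile given as a tuple of numbers, uses Euclidean distance as the dissimilarity, and the function names `minimum_spanning_tree` and `cut_longest_edges` are hypothetical. It builds the tree with a simple O(n^2) Prim's algorithm, so it does not reproduce the linear runtime reported for the improved algorithms.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over `points`.

    Edge weights are Euclidean distances (an assumption of this sketch).
    Returns a list of (weight, u, v) tuples, one per tree edge.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        # Relax connections from the newly added node.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def cut_longest_edges(points, k):
    """Split the MST into k clusters by removing its k - 1 longest edges.

    Returns one cluster label per point, numbered by first appearance.
    """
    n = len(points)
    edges = sorted(minimum_spanning_tree(points))
    kept = edges[:max(0, n - k)]  # the tree has n - 1 edges; keep the n - k shortest

    # Union-find over the kept edges yields the connected components.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Five made-up two-condition expression profiles forming two obvious groups.
profiles = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.1, 5.0)]
print(cut_longest_edges(profiles, 2))
```

Removing the longest tree edges separates groups that are joined only by a large dissimilarity, which is why the approach suits data whose clusters are well separated; the number of clusters `k` is supplied by the caller here.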
Keywords/Search Tags:clustering, gene expression data, minimum spanning tree, fuzzy clustering