
Cluster Analysis and Its Application in Gene Expression Data

Posted on: 2005-06-23, Degree: Master, Type: Thesis
Country: China, Candidate: Q S Deng, Full Text: PDF
GTID: 2168360152469230, Subject: Computer application technology
Abstract/Summary:
Gene microarray technology makes it possible to observe the expression of thousands of genes simultaneously. Clustering is currently the most frequently used approach to analyzing gene expression data, and many clustering methods have been applied to it. The problems related to clustering include data preprocessing, similarity measures and cluster validity. Considering the particular properties of gene expression data and the characteristics of several clustering algorithms, this thesis introduces two clustering models for gene expression data analysis.

The first model is a fuzzy cluster analysis of gene expression data based on a cluster validity measure, the Xie-Beni index. The model uses the Xie-Beni index, a validity measure applicable to fuzzy clustering, to compare different clustering results obtained with the same number of clusters as well as results obtained with different numbers of clusters. We applied the model to a leukaemia expression data set. The experimental results showed that the model can determine the number of clusters automatically and achieves high classification accuracy.

The second model is a cluster analysis of gene expression data that combines a self-organizing map (SOM) with k-means, based on a cluster validity measure, the Silhouette index. Because the cluster boundaries between nodes are not clear in SOM results, this model applies k-means clustering to the SOM output. In addition, the clustering results differ from run to run because they depend on the initial values of the nodes and on the order in which samples are presented during learning. The model therefore uses the Silhouette index, a validity measure applicable to hard clustering, to compare the different clustering results. We applied the model to leukaemia and colon gene expression data and obtained good experimental results.
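The idea behind the first model, scoring fuzzy clusterings with the Xie-Beni index and keeping the number of clusters with the lowest score, can be illustrated with a short sketch. This is not the thesis's implementation: the function names (`fuzzy_c_means`, `xie_beni`, `select_cluster_number`), the fuzzifier m = 2, the random initialization and the single run per cluster number are all assumptions made for illustration. The index is computed in its common generalized form, the membership-weighted sum of squared distances to the cluster centers divided by n times the smallest squared distance between two centers, so lower values indicate compact, well-separated clusters.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Basic fuzzy c-means (illustrative). Returns memberships U (c, n) and centers V (c, d)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)           # each sample's memberships sum to 1
    for _ in range(n_iter):
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # squared distances between every center and every sample, shape (c, n)
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)              # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V


def xie_beni(X, U, V, m=2.0):
    """Xie-Beni index: compactness / (n * minimum squared center separation)."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)               # ignore the distance of a center to itself
    return compactness / (X.shape[0] * sep.min())


def select_cluster_number(X, candidates, m=2.0):
    """Run fuzzy c-means once per candidate c and keep the c with the lowest index."""
    scores = {}
    for c in candidates:
        U, V = fuzzy_c_means(X, c, m=m)
        scores[c] = xie_beni(X, U, V, m=m)
    return min(scores, key=scores.get), scores
```

In practice one would call `select_cluster_number(X, range(2, 8))` on a samples-by-genes matrix `X`. Repeating each run with several seeds and keeping the lowest index would also cover the comparison of different results under the same cluster number that the abstract mentions.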
Keywords/Search Tags:Cluster, Fuzzy C-means, Self-Organizing map, Cluster validity, Bioinformatics