
Research And Application Of Clustering Algorithms For Biological Data

Posted on: 2009-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2178360272457280
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of DNA microarray technology, huge amounts of gene expression data have been generated. How to analyze these data and extract valuable biological knowledge from them has become a major research topic of the post-genomic age. Cluster analysis is a major exploratory technique that groups genes with related functions according to the similarity of their expression profiles. It helps in understanding gene function, gene regulation, cellular processes, and cell subtypes. Many clustering algorithms have been applied to gene expression data with considerable success, but many problems have also arisen in practice. This thesis focuses on the research and application of clustering algorithms for gene expression data.
First, we introduce bioinformatics and DNA microarray technology as background for gene expression data. Second, we present the basics of gene expression data analysis and describe several traditional clustering algorithms: hierarchical clustering in four forms, K-means clustering, and Self-Organizing Maps (SOM) clustering. At the end of that chapter, we present four gene expression data sets that have external biological criteria. Third, we study gene clustering algorithms based on swarm intelligence in depth: we review the development of clustering algorithms for gene expression data in recent years, introduce the GAK algorithm, and then propose a novel clustering algorithm for gene expression data based on QPSO, which we test on the four data sets mentioned above. The experimental results show that the new clustering algorithm performs well. Fourth, we study external assessment and parameter selection: after introducing the Rand index, we examine how the choice of similarity metric and data transformation affects the clustering algorithms.
Finally, we introduce internal validation techniques for clustering results and use them to verify the results of the algorithms discussed in the thesis.
Keywords/Search Tags:DNA microarray, gene expression data, Clustering Algorithm, QPSO, Cluster Validation
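
The abstract names the Rand index as its external assessment measure but does not define it. As a minimal sketch of the standard (unadjusted) definition, not the thesis's own implementation, the index is the fraction of item pairs on which two partitions agree: both place the pair in the same cluster, or both place it in different clusters. The function name `rand_index` and the toy labels are illustrative assumptions.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Return the unadjusted Rand index of two labelings of the same items.

    Illustrative helper (name and signature are assumptions, not from the
    thesis). A pair of items counts as an agreement when both labelings put
    the pair together, or both put it apart.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must cover the same number of items")
    if len(labels_true) < 2:
        raise ValueError("at least two items are needed to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_true == same_in_pred:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs


if __name__ == "__main__":
    # Hypothetical toy data: known functional classes of six genes
    # versus cluster labels produced by some clustering algorithm.
    known_classes = [0, 0, 0, 1, 1, 1]
    cluster_labels = [1, 1, 0, 0, 0, 0]
    print(rand_index(known_classes, cluster_labels))
```

The index depends only on how items are grouped, not on the label values themselves, so renaming the clusters leaves the score unchanged. This pairwise loop is quadratic in the number of genes, which is acceptable for a sketch but slow for large data sets.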