The Research And Application On Gene Expression By Clustering Algorithms

Posted on: 2014-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: W Li
Full Text: PDF
GTID: 2298330434950839
Subject: Software engineering
Abstract: The rapid growth of gene expression data has created an urgent need for effective tools to analyze these data and mine the biological information implicit in them. Cluster analysis is among the most widely used data mining tools, and many clustering algorithms have been proposed in recent years. Because these algorithms rest on different principles, however, they differ in how efficiently they analyze gene expression data. Evaluating the existing clustering algorithms effectively, and developing new and more effective ones, has therefore become a priority.

This thesis first analyzes and summarizes the theory behind the k-means clustering algorithm, the self-organizing map clustering algorithm, and the silhouette index. To address the defects of the original k-means algorithm, an enhanced k-means algorithm that optimizes the initial cluster centers was designed and implemented in Matlab. To address the shortcomings of self-organizing map clustering, a self-organizing map clustering algorithm that incorporates the k-means algorithm and the silhouette index was likewise designed in Matlab. The three clustering algorithms were then applied to three sets of real gene expression data; the experimental results were tabulated, and the strengths and weaknesses of the three algorithms' clustering performance were compared.

In this thesis, the k-means and self-organizing map clustering algorithms were optimized on the basis of earlier theoretical results. Analysis of several groups of experimental results showed the optimizations to be effective.
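As an illustration of the ideas named in the abstract, the following is a minimal Python sketch of k-means with non-random initial centers, scored by the silhouette index. It is not the thesis's Matlab implementation: the abstract does not say how the initial centers are optimized, so a farthest-first heuristic is used here purely as an assumed stand-in, and the function names and the toy `profiles` data are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first_centers(points, k):
    """Assumed stand-in for the thesis's center optimization:
    start from the first point, then repeatedly add the point
    whose nearest already-chosen center is farthest away."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd iterations starting from the centers chosen above."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

def silhouette_index(points, labels):
    """Mean silhouette width; requires at least two clusters."""
    clusters = set(labels)
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: score 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data (hypothetical): six genes measured under three conditions.
profiles = [(1.0, 1.1, 0.9), (0.9, 1.0, 1.2), (1.1, 0.8, 1.0),
            (5.0, 5.2, 4.9), (5.1, 4.8, 5.0), (4.9, 5.0, 5.3)]
labels, centers = kmeans(profiles, 2)
score = silhouette_index(profiles, labels)
```

A silhouette index close to 1 indicates compact, well-separated clusters, so running `kmeans` for several values of `k` and keeping the one with the highest score is one common way to choose the number of clusters.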
Keywords/Search Tags: gene expression data, silhouette index, k-means clustering, self-organizing map clustering, cluster analysis algorithm