Microarray Data Clustering Algorithm

Posted on:2007-11-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y Ma

Full Text:PDF

GTID:2208360182994898

Subject:Computer software and theory

Abstract/Summary:

With the progress of the Human Genome Project, research on gene function and on the individual genes of the genome has steadily deepened. Analyzing gene expression at different times and under different conditions is the main approach to discovering gene function, and the cDNA microarray technique is an important tool for biologists studying genes. Because microarray experiments generate large volumes of gene expression data, data mining techniques are essential for extracting valuable information from these data.

Genes with similar functions tend to have similar expression patterns, so the function of an unknown gene can be predicted by analyzing genes whose expression is similar. A clustering algorithm partitions data according to similarity, so that items of the same kind are grouped together. Applied to gene expression data, clustering places genes with similar expression in the same group, which helps biologists discover gene function and inheritance patterns.

Most of the clustering algorithms that have been applied to gene expression data originate from non-biological fields, and they show several shortcomings in this application. First, K-means and self-organizing maps require the user to supply the number of clusters, which is hard to estimate before clustering, and the final result changes substantially when this parameter changes. Second, many traditional clustering algorithms, such as hierarchical clustering, are sensitive to noisy data. Finally, because the traditional algorithms originate from non-biological fields, the clusters they produce do not carry precise biological meaning.

To address these shortcomings, this thesis introduces the "K nearest neighbors absorbed first" idea and some known biological information into a density-based algorithm, and designs and implements a novel K-nearest-neighbors-absorbed-first clustering algorithm. The algorithm is applied to a yeast cell cycle dataset. A comparison with K-means shows that the K-nearest-neighbors-absorbed-first clustering algorithm provides more useful information than K-means, in both cluster structure and biological meaning.
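
The abstract's remark that K-means depends on a user-supplied number of clusters can be illustrated with a minimal sketch. This is not the thesis's algorithm or its data: the `profiles` values are made-up toy expression profiles (three time points per gene, three obvious groups), and `kmeans` and `within_cluster_sse` are hypothetical helper names written for this example using plain Lloyd iterations with a deterministic initialisation.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; initial centroids are evenly spaced picks
    from the input so the run is deterministic (an assumption of this sketch)."""
    step = len(points) / k
    centroids = [points[int(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def within_cluster_sse(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Toy data (invented for illustration): three groups of three "genes" each.
profiles = [
    (0.0, 0.1, 0.0), (0.1, 0.0, 0.1), (0.0, 0.0, 0.1),       # low expression
    (5.0, 5.1, 5.0), (5.1, 5.0, 4.9), (4.9, 5.0, 5.1),       # medium expression
    (10.0, 9.9, 10.0), (9.9, 10.0, 10.1), (10.1, 10.0, 9.9), # high expression
]

# Run the same data with three different user-chosen values of k.
results = {k: kmeans(profiles, k) for k in (2, 3, 4)}
sse = {k: within_cluster_sse(profiles, *results[k]) for k in results}

if __name__ == "__main__":
    for k in sorted(results):
        print(k, results[k][0], round(sse[k], 4))
```

On this toy data, k = 2 merges the medium and high groups, k = 3 recovers the three groups, and k = 4 splits one of the true groups. The within-cluster error keeps decreasing as k grows, so it does not by itself indicate which k is right, which is the difficulty the abstract describes.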

Keywords/Search Tags:

Microarray, gene expression data, clustering, K nearest neighbors, density-based clustering

Related items

1	Research On Relevant Problems Of DNA Microarray Expression Data Analysis
2	Comparison of clustering algorithms for gene expression microarray data
3	Association Rules Mining And Its Applications In Microarray Gene Expression Data
4	Clustering algorithms for time series gene expression in microarray data
5	Clustering Analysis Based On The Ant System For Gene Expression Data
6	Gaussian Mixture Model-based Clustering Analysis For Gene Microarray Expression Data
7	K-means clustering with automatic determination of K using a Multiobjective Genetic Algorithm with applications to microarray gene expression data
8	The Research And Application Of Clustering Algorithm Based On Density
9	Research Of Density Peak Clustering Algorithm Based On K-nearest Neighbor Optimization
10	Research On Density Peaks Clustering Algorithm Based On Nearest-Neighbor Optimization