K-means clustering with automatic determination of K using a Multiobjective Genetic Algorithm with applications to microarray gene expression data

Posted on: 2016-04-06
Degree: M.S.
Type: Thesis
University: San Diego State University
Candidate: Shaw, Matthew Karl Ellis
Full Text: PDF
GTID: 2478390017476475
Subject: Computer Science
Abstract/Summary:
As the role of large-scale data analysis continues to expand, the task of extracting useful information becomes ever more important. Clustering, the task of grouping together data points that share similar features, presents high-dimensional data in a form that humans can comprehend more easily. It also allows inferences to be drawn about previously unseen data points based on the known characteristics of other points in the same cluster. Over the years, several techniques have been developed for clustering data. While effective, most of these algorithms require the number of clusters to be specified in advance. In situations where the domain is well understood, this requirement is a minimal burden, but it becomes problematic when little is known about the data being analyzed. The work that follows investigates using a Multiobjective Genetic Algorithm to discover an optimal number of clusters while simultaneously finding high-quality clustering solutions. This algorithm is then applied to clustering microarray gene expression data.
Keywords/Search Tags:Microarray gene expression data, Clustering, Multiobjective genetic algorithm, Partition the data into
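
The trade-off described in the abstract can be illustrated with a small sketch. This is not the thesis's genetic algorithm: it is a brute-force sweep over K, using plain K-means, that shows the two competing objectives a multiobjective method might balance (fewer clusters versus tighter clusters). The function names `kmeans` and `pareto_front`, the choice of within-cluster sum of squares as the compactness objective, and the toy data are all assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns (centroids, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as starting centroids
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wcss

def pareto_front(candidates):
    """Keep the (k, wcss) pairs that no other pair dominates (both are minimized)."""
    front = []
    for k, w in candidates:
        dominated = any(
            k2 <= k and w2 <= w and (k2 < k or w2 < w)
            for k2, w2 in candidates
        )
        if not dominated:
            front.append((k, w))
    return front

# Toy data (hypothetical): three well-separated groups in two dimensions.
data_rng = random.Random(1)
points = [
    (cx + data_rng.uniform(-0.5, 0.5), cy + data_rng.uniform(-0.5, 0.5))
    for cx, cy in [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
    for _ in range(20)
]

# Evaluate both objectives for each candidate K and keep the non-dominated ones.
candidates = [(k, kmeans(points, k)[1]) for k in range(1, 7)]
for k, w in pareto_front(candidates):
    print(f"K={k}  WCSS={w:.2f}")
```

Because compactness usually keeps improving as K grows, the front tends to contain several values of K rather than a single winner. Choosing among them, or searching the space with a genetic algorithm instead of an exhaustive sweep, is the part this sketch leaves out.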