Feature cluster selection for high-dimensional data analysis

Posted on:2008-12-18

Degree:M.S

Type:Thesis

University:State University of New York at Binghamton

Candidate:Li, Hao

GTID:2448390005451944

Subject:Computer Science

Abstract/Summary:

This thesis addresses the gaps between traditional data mining tasks, feature selection and clustering, and the knowledge desired by domain experts in real-world applications. It illustrates two particular gaps using microarray data analysis: the gap between a near-optimal feature subset and a candidate set of interesting features, and the gap between good clusters and relevant clusters. This thesis proposes to bridge such gaps with a new data mining task, feature cluster selection, which aims to select and group all relevant features in a data set into a small number of coherent clusters. It provides both a formal definition and an empirical formulation for the new problem, and describes an efficient solution based on Max-relevance, Max-cohesion, and Min-separation criteria. Experiments on microarray data verify that the solution can discover relevant feature clusters of statistical significance as well as select representative feature subsets of high accuracy.
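
To make the idea of feature cluster selection more concrete, the following is a minimal sketch and not the thesis's algorithm. The abstract does not say how relevance, cohesion, or separation are measured, so the sketch assumes absolute Pearson correlation for all three. The function name `select_feature_clusters`, the thresholds `min_relevance` and `min_cohesion`, and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_feature_clusters(features, labels, min_relevance=0.5, min_cohesion=0.8):
    """Greedy illustration only (assumed measures, hypothetical thresholds).

    features: dict mapping feature name -> list of values, one per sample
    labels:   list of numeric class labels, one per sample
    Returns a list of clusters; the first name in each cluster is its
    most relevant member and serves as the representative.
    """
    # Relevance (assumption): absolute correlation between a feature and the labels.
    relevance = {name: abs(pearson(vals, labels)) for name, vals in features.items()}

    # Keep only relevant features, most relevant first.
    relevant = sorted(
        (name for name in features if relevance[name] >= min_relevance),
        key=lambda name: -relevance[name],
    )

    clusters = []
    for name in relevant:
        for cluster in clusters:
            representative = cluster[0]
            # Cohesion (assumption): absolute correlation with the representative.
            if abs(pearson(features[name], features[representative])) >= min_cohesion:
                cluster.append(name)
                break
        else:
            # Not cohesive with any existing cluster, so start a separate one.
            clusters.append([name])
    return clusters


# Toy data: six samples, two classes, four made-up features.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "g2": [2.1, 2.3, 2.0, 6.1, 6.0, 5.9],
    "g4": [0.0, 2.0, 1.0, 1.0, 3.0, 2.0],
    "g5": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}
clusters = select_feature_clusters(features, labels)
print(clusters)
```

In this toy run, `g1` and `g2` are both strongly related to the labels and to each other, so they share a cluster; `g4` is moderately relevant but not cohesive with them, so it forms its own cluster; `g5` falls below the relevance threshold and is dropped. Taking the first member of each cluster gives one representative feature per cluster, loosely mirroring the "representative feature subsets" mentioned in the abstract.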

Keywords/Search Tags:

Feature, Data, Selection, Clusters

Related items

1	Research Of Feature Selection In Clusters Analysis
2	Localized feature selection for unsupervised learning
3	Researches On Feature Selection Methods For Functional Data
4	Feature Selection Mechanism For Multimodal Social Media Data With Privacy Protection
5	Feature selection focused within error clusters
6	The Research On Causal Feature Selection Algorithm Based On AD-tree
7	Research On New Feature Selection Algorithm
8	Research On Dynamic Feature Selection Algorithm For Flow Features
9	The Research On Feature Selection Algorithms Based On Information Theory
10	Research About Feature Selection And Classification For Interactive Feature Of High-dimensional Data