
Within-class and unsupervised clustering improve accuracy and extract local structure for supervised classification

Posted on: 2007-05-19
Degree: Ph.D.
Type: Dissertation
University: Rutgers The State University of New Jersey - New Brunswick
Candidate: Fradkin, Dmitriy
Full Text: PDF
GTID: 1448390005966852
Subject: Computer Science
Abstract/Summary:
Deterministic clustering methods at different levels of granularity (within classes, at the class level, and across classes) are investigated for their effect on classification performance in a series of empirical studies. Specifically, I have found that clustering within classes, by extracting local structure, can improve supervised learning performance in many cases. This approach and unsupervised clustering across entire data sets are usually better for prediction than clustering or grouping within-class clusters across classes, or even global classifier approaches. These conclusions are supported by more than 3000 experiments on four benchmark data sets, using some of the most powerful automated classification methods, such as regularized logistic regression and Support Vector Machines (SVMs).

I have used a simple combination of unsupervised clustering and classification methods to search for local structure in the data and detect locally significant features. The approach is illustrated by analysis of lung cancer survival data from records of 200,000 patients.
Keywords/Search Tags:Clustering, Local structure, Classification, Classes
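
The within-class clustering idea described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation. It assumes k-means with a farthest-first start as the deterministic clustering method, and it substitutes a nearest-centroid rule for the regularized logistic regression and SVM classifiers named in the abstract. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iters=20):
    """Deterministic k-means: farthest-first seeding, then Lloyd iterations."""
    centroids = [points[0]]
    while len(centroids) < min(k, len(points)):
        # Next seed is the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(groups[i]) if groups[i] else c
                     for i, c in enumerate(centroids)]
    return centroids


def fit_within_class(X, y, k):
    """Cluster each class separately; return (centroid, class_label) pairs."""
    by_class = defaultdict(list)
    for point, label in zip(X, y):
        by_class[label].append(tuple(point))
    model = []
    for label, pts in by_class.items():
        model.extend((c, label) for c in kmeans(pts, k))
    return model


def predict(model, point):
    """Assign the class of the nearest within-class centroid (stand-in classifier)."""
    return min(model, key=lambda pair: dist(point, pair[0]))[1]


def accuracy(model, X, y):
    """Fraction of points whose predicted class matches the given label."""
    return sum(predict(model, p) == t for p, t in zip(X, y)) / len(y)


# Invented toy data: each class consists of two well-separated local groups.
X = [(0, 0), (0, 1), (1, 0), (4, 4), (4, 5), (5, 4),
     (0, 4), (0, 5), (1, 4), (4, 0), (5, 0), (4, 1)]
y = ["a"] * 6 + ["b"] * 6

global_model = fit_within_class(X, y, k=1)  # one centroid per class
local_model = fit_within_class(X, y, k=2)   # two within-class clusters per class
```

On this toy data the two class means coincide, so a single centroid per class cannot separate the classes, while two within-class clusters per class recover the local groups. Comparing `accuracy(global_model, X, y)` with `accuracy(local_model, X, y)` shows the difference on the training points.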