Sampling the top-k representative data for classification

Posted on: 2006-08-08

Degree: M.Sc.

Type: Thesis

University: Simon Fraser University (Canada)

Candidate: Wang, Ping

GTID: 2458390005997663

Subject: Computer Science

Abstract/Summary:

Building classification models from databases is an active area of data mining research. In many classification tasks, only a small set of labelled training data is given, and this set is not sufficient to train a good classifier. More data must be sampled and labelled as training data to improve performance. However, labelling data is time-consuming and costly. The challenge is to select the most representative data for labelling effectively.

Most active learning methods for this problem follow the incremental query learning paradigm, in which the classifier is retrained after each newly labelled query. In contrast, we present a distance-based method that samples the top-k representative data simultaneously and can be applied to any distance-based classifier. Redundancy reduction makes classifier retraining unnecessary and lets the method find examples that are more balanced with respect to the class distribution in the database. Experimental results on two data sets with two classifiers demonstrate the advantages of our method.
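The abstract does not spell out the selection algorithm, so the following is only a hypothetical sketch of what "sampling the top-k representative data simultaneously with redundancy reduction" could look like, not the thesis's actual method. It assumes Euclidean distance, a user-chosen neighbourhood `radius`, and a greedy coverage rule; the function names `euclidean` and `top_k_representatives` are invented for illustration. No classifier is trained or retrained during selection, which is the property the abstract emphasises.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_representatives(points, k, radius):
    """Greedy sketch (an assumption, not the thesis's algorithm).

    Repeatedly pick the unlabelled point that represents the most
    not-yet-covered points within `radius`. Points already covered by an
    earlier pick are skipped, which serves as the redundancy reduction.
    Returns the indices of the selected points.
    """
    n = len(points)
    # neighbours[i] holds the indices within `radius` of point i (including i).
    neighbours = [
        {j for j in range(n) if euclidean(points[i], points[j]) <= radius}
        for i in range(n)
    ]
    covered = set()
    selected = []
    while len(selected) < k and len(covered) < n:
        # Candidates are restricted to points not yet represented.
        best = max(
            (i for i in range(n) if i not in covered),
            key=lambda i: len(neighbours[i] - covered),
        )
        selected.append(best)
        covered |= neighbours[best]
    return selected


# Toy data: a dense cluster, a smaller cluster, and an isolated point.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (10, 10)]
print(top_k_representatives(points, k=2, radius=0.5))
```

In this toy run the two selected indices come from different clusters, because the second pick ignores everything the first pick already covers. The selected points would then be sent for labelling in one batch and passed to any distance-based classifier, such as a nearest-neighbour model.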

Keywords/Search Tags:

Top-k representative data, Classification

Related items

1	Enriching The Representative Of Document Using IRF
2	Symbolic Data Classification And Active Learning Based On Coverage Reduction
3	Research On Representative Skyline Algorithm Based On Large-scale Data Sets
4	The Design And Implementation Of Medical Representative Management System Based On Front And Rear Platform Separation Technology
5	Classification, Content-based Sports Programs
6	The Studies Of Model Report
7	Representative Image Selection In Large-scale Image Dataset
8	Research Of Chinese Page Automatic Classification Based On Representive Samples
9	Research On Photo Classification And Re-finding Technology For Mobile Devices