Active Learning For Cost-sensitive Classification

Posted on: 2017-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Zhou
Full Text: PDF
GTID: 2428330590968276
Subject: Electronic and communication engineering
Abstract/Summary:
Active learning is an important research area in the machine learning community. It aims to selectively label the most informative examples in order to train a high-quality model, thereby reducing the cost of data collection. So far, active learning research has mainly focused on balanced classification problems. In many real-world applications, however, the data distribution is highly skewed and different misclassification errors incur different costs. This is the so-called cost-sensitive classification problem. Active learning and cost-sensitive classification both have broad practical applications, yet research on active learning methods for cost-sensitive classification remains limited. In this thesis, we investigate active learning for cost-sensitive classification. We first propose a general active learning framework based on generalization error optimization. By incorporating misclassification cost into the sampling function, the framework applies to cost-sensitive classification problems and adapts flexibly to different base learners. We then apply the framework to the logistic regression model and the naive Bayes model by deriving the corresponding model estimation methods, which yields concrete active learning algorithms for cost-sensitive classification with each of the two models as the base learner. We test the proposed algorithms on various real-world data sets and compare them against several well-known existing algorithms. Extensive experimental results demonstrate that the proposed active learning algorithms are highly effective at choosing the most informative examples for cost-sensitive classifiers and significantly outperform many state-of-the-art methods.
Keywords/Search Tags:machine learning, active learning, cost-sensitive classification, generalization error
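
The abstract does not spell out the sampling function, so the following is only a minimal sketch of the general idea of folding misclassification cost into example selection. It is not the thesis's generalization-error-based criterion. It assumes a hypothetical 2x2 cost matrix indexed as cost[true][predicted], and class probabilities that would normally come from a base learner such as logistic regression or naive Bayes. All function names are illustrative. The sketch queries the pool example whose best available decision still carries the highest expected cost.

```python
# Illustrative cost-weighted sampling; NOT the thesis's actual sampling function.

def expected_costs(probs, cost_matrix):
    """Expected cost of predicting each class j: sum_i probs[i] * cost_matrix[i][j]."""
    n_classes = len(probs)
    return [
        sum(probs[i] * cost_matrix[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]


def min_risk(probs, cost_matrix):
    """Return (cost-minimizing predicted class, its expected cost)."""
    costs = expected_costs(probs, cost_matrix)
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


def select_query(pool_probs, cost_matrix):
    """Index of the pool example whose minimum expected cost is largest."""
    risks = [min_risk(p, cost_matrix)[1] for p in pool_probs]
    return max(range(len(risks)), key=risks.__getitem__)


# Hypothetical costs: missing a class-1 example (true 1, predicted 0) costs 10,
# a false alarm (true 0, predicted 1) costs 1, correct predictions cost 0.
cost_matrix = [[0.0, 1.0],
               [10.0, 0.0]]

# Hypothetical class probabilities for three unlabeled pool examples.
pool_probs = [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]]

query_index = select_query(pool_probs, cost_matrix)
print(query_index)
```

With these assumed numbers, the selected example is the one with probabilities [0.9, 0.1] rather than the [0.5, 0.5] example that plain uncertainty sampling would choose, because the asymmetric costs shift where the decision is hardest.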