Research on Instance Selection Algorithms Based on the Nearest Neighbor Classifier

Posted on: 2010-12-30
Degree: Master
Type: Thesis
Country: China
Candidate: N Zhang
Full Text: PDF
GTID: 2178360302461987
Subject: Computer application technology
Abstract/Summary:
The nearest neighbor classifier is one of the most important machine learning algorithms. However, it is computationally expensive and has large storage requirements, so instance selection for the nearest neighbor classifier has become a research focus. In addition, existing instance selection algorithms for the nearest neighbor classifier operate on labeled data sets, and labeling instances requires a great deal of manual labor and resources. Selecting instances from an unlabeled data set is one feasible way to solve this problem.

To reduce the computation and storage that the nearest neighbor classifier requires, we propose an instance selection algorithm based on contribution. The algorithm selects instances from a labeled data set according to their contribution to classification, and it tolerates some training error in order to improve generalization ability.

To reduce labeling cost, we introduce maximum entropy into instance selection and propose an instance selection algorithm based on maximum entropy. By computing the information entropy of candidate instances and selecting those with maximum entropy from the unlabeled instances, we obtain the instances most important for classification and ask an expert to label them. Experiments on both artificial and real data sets demonstrate the effectiveness of this algorithm.
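The maximum-entropy selection step can be illustrated with a minimal sketch. The abstract does not say how the class distribution of a candidate instance is estimated, so the sketch assumes it is taken from the votes of the k nearest already-labeled neighbors. The function names (`knn_label_distribution`, `entropy`, `select_max_entropy`) and the parameter `k` are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def knn_label_distribution(x, labeled, k=3):
    """Estimate class probabilities for x from its k nearest labeled neighbors.

    `labeled` is a list of (point, label) pairs. Using neighbor votes as the
    probability estimate is an assumption, not the thesis's stated method.
    """
    neighbors = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
    counts = Counter(label for _, label in neighbors)
    return {label: n / len(neighbors) for label, n in counts.items()}


def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution given as a dict."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def select_max_entropy(unlabeled, labeled, k=3):
    """Return the unlabeled point whose estimated class distribution has
    the highest entropy, i.e. the point the classifier is least sure about."""
    return max(
        unlabeled,
        key=lambda x: entropy(knn_label_distribution(x, labeled, k)),
    )
```

In this sketch the selected point would be passed to an expert for labeling and then moved into the labeled set before the next round of selection. Points whose neighbors all share one class have zero entropy and are never preferred over points near a class boundary.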
Keywords/Search Tags: Nearest neighbor classifier, Instance selection, Contribution, Maximum entropy, Labeling cost