Research On Ensemble KNN Classifier Based On Metric-learning Method

Posted on: 2010-12-12, Degree: Master, Type: Thesis
Country: China, Candidate: F Yu, Full Text: PDF
GTID: 2178360302460569, Subject: Measuring instruments
Abstract/Summary:
In recent years, data mining has attracted a great deal of attention in science and the information industry. As more data are gathered, with the amount of data doubling every year, data mining is becoming an increasingly important tool for transforming these data into useful information. It is commonly used in a wide range of practices, such as marketing, surveillance, fraud detection and scientific discovery.

This thesis focuses on one branch of data mining, classification, and proposes a new ensemble learning algorithm based on a metric-learning KNN classifier. First, in a filtering procedure, the information gain of every original input attribute is evaluated, and attributes whose information gain is less than a threshold f are deemed irrelevant and removed. Second, in a bagging-style assembling procedure, the component learners are formed by combining regular bootstrap resampling of the instances in the filtered dataset with a perturbation that randomly selects input attributes of the filtered dataset. In this way a strong ensemble can be generated with both high accuracy and diversity. Third, in a metric-learning procedure, each component learner classifies with a modified KNN classifier, Neighbourhood Components Analysis (NCA), which learns the KNN distance metric through a purpose-designed optimization method. Finally, in a combining procedure, a majority-voting strategy combines the results produced by the component learners.

A large empirical study shows that this algorithm performs better than simple metric-learning algorithms and simple ensemble KNN classifiers.
Keywords/Search Tags: Data Mining, Ensemble Learning, Component Learner, Metric Learning
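
The four procedures described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: all function names, the toy dataset, the threshold value 0.5, and the ensemble size are assumptions made for illustration. Attributes are treated as discrete when computing information gain, and the NCA metric-learning step is not implemented here; plain squared Euclidean distance stands in where a learned metric would be used.

```python
import math
import random
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def information_gain(X, y, j):
    # Assumption: attribute j is discrete, so each distinct value is one partition.
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(row[j], []).append(label)
    remainder = sum(len(g) / len(y) * entropy(g) for g in groups.values())
    return entropy(y) - remainder


def filter_attributes(X, y, threshold):
    # Filtering: drop attributes whose information gain is below the threshold.
    return [j for j in range(len(X[0])) if information_gain(X, y, j) >= threshold]


def build_ensemble(X, y, kept, n_learners, n_attrs, rng):
    # Assembling: each component learner gets a bootstrap sample of the
    # instances and a random subset of the attributes that survived filtering.
    learners = []
    for _ in range(n_learners):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        attrs = rng.sample(kept, min(n_attrs, len(kept)))
        learners.append(([X[i] for i in idx], [y[i] for i in idx], attrs))
    return learners


def knn_predict(train_X, train_y, x, k, attrs):
    # Placeholder metric: squared Euclidean distance over the chosen attributes.
    # A metric learned by NCA would replace this function in the real method.
    def dist(a):
        return sum((a[j] - x[j]) ** 2 for j in attrs)

    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def ensemble_predict(learners, x, k=3):
    # Combining: majority vote over the component learners' predictions.
    votes = [knn_predict(tx, ty, x, k, attrs) for tx, ty, attrs in learners]
    return Counter(votes).most_common(1)[0][0]


# Toy data (hypothetical): attributes 0 and 1 separate the two classes,
# attribute 2 is a parity bit that carries almost no class information.
X = [[a, b, (a + b) % 2] for a in range(3) for b in range(3)]
X += [[a + 5, b + 5, (a + b) % 2] for a in range(3) for b in range(3)]
y = [0] * 9 + [1] * 9

kept = filter_attributes(X, y, threshold=0.5)
learners = build_ensemble(X, y, kept, n_learners=7, n_attrs=1, rng=random.Random(0))
print(kept, ensemble_predict(learners, [1, 1, 0]), ensemble_predict(learners, [6, 6, 1]))
```

In this sketch the diversity of the ensemble comes from two sources, as in the abstract: the bootstrap resampling of instances and the random choice of attributes per learner. Swapping the distance function for a learned metric is the only change needed to bring the component learners closer to the described method.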