An Improved KNN Classification Algorithm Based On Region Partitioning

Posted on: 2017-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: J W Hu
Full Text: PDF
GTID: 2358330503486137
Subject: Software engineering
Abstract/Summary:
Data classification is an important part of data mining, used mainly to extract a classification model from a data set. Its goal is to build a classifier that can predict the class of a data sample. KNN is a widely used classification algorithm: it is accurate, simple in principle, and easy to implement, and it can model data spaces with multiple dimensions. However, running the KNN classification algorithm on the whole training set requires a large amount of calculation, and that amount grows exponentially as the number of dimensions increases.

To improve the efficiency of the KNN algorithm, we propose two improved KNN classification algorithms, one based on hyper-sphere region partitioning and one based on hyper-cube region partitioning, to suit different situations. First, in the training phase, the training set is divided into several regions according to its distribution, using a specified partition method, and a primary classifier is built on those regions. Second, in the testing phase, the primary classifier determines a new training set, and the KNN algorithm is run on that new training set to classify the test samples. Because the new training set contains far fewer samples than the original training set, the amount of calculation required by the KNN algorithm decreases substantially.

The improved KNN algorithm based on hyper-sphere region partitioning uses the simulated annealing algorithm to control the number of spheres and thereby reduce the amount of calculation. The improved KNN algorithm based on hyper-cube region partitioning reduces the number of dimensions to control the number of cubes, lowering both the amount of calculation and the storage occupied.

Experiments that classify the test samples of seven data sets with both the improved KNN algorithms and classical KNN demonstrate that the proposed schemes are efficient.
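The two-phase idea for the hyper-cube variant can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the fixed cube width `width`, the choice of partitioning dimensions `dims`, the helper names, and the fallback to the full training set when a cube holds fewer than `k` samples are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def cell_of(point, mins, width, dims):
    """Map a point to the index tuple of the hyper-cube that contains it."""
    return tuple(int(math.floor((point[d] - mins[d]) / width)) for d in dims)


def build_partition(X, width, dims):
    """Training phase: group training-sample indices by hyper-cube.

    Partitioning on a subset of dimensions (dims) keeps the number of
    cubes small; how dims is chosen is an assumption of this sketch.
    """
    mins = [min(row[d] for row in X) for d in range(len(X[0]))]
    cells = defaultdict(list)
    for i, row in enumerate(X):
        cells[cell_of(row, mins, width, dims)].append(i)
    return mins, cells


def knn_vote(X, y, candidates, query, k):
    """Plain KNN majority vote restricted to the candidate indices."""
    nearest = sorted(candidates, key=lambda i: math.dist(X[i], query))[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]


def classify(X, y, mins, cells, width, dims, query, k=3):
    """Testing phase: look up the query's cube, then run KNN inside it."""
    candidates = cells.get(cell_of(query, mins, width, dims), [])
    if len(candidates) < k:
        # Assumed fallback: use the whole training set for sparse cubes.
        candidates = range(len(X))
    return knn_vote(X, y, candidates, query, k)
```

The cube lookup plays the role of the primary classifier here: it selects the reduced training set, so the distance computations in `knn_vote` touch only the samples that share the query's cube rather than every training sample.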
Keywords/Search Tags: classification, KNN, region partition, hyper-sphere, hyper-cube