Research Of KNN Text Classification Algorithm Based On Area Division

Posted on: 2013-06-08  Degree: Master  Type: Thesis
Country: China  Candidate: Y Hu  Full Text: PDF
GTID: 2248330374983017  Subject: Computer software and theory
Abstract/Summary:
The k-nearest neighbor (kNN) algorithm is a simple, effective, non-parametric classification method. However, it must compute the distance between a test sample and every training sample, which makes classification expensive. To improve classification efficiency, we propose a fast kNN text classification algorithm based on area division. We divide the training set into several parts according to their spatial distribution; then, using the relative positions of a test pattern and those parts, we can find the k nearest neighbors of the test pattern in the training set without examining every training sample. This sharply reduces the amount of computation that kNN requires. This paper discusses four partition methods: the mesh method, the clustering method, the same-radius spherical method and the same-sample spherical method. Two kinds of pre-classifier model are established from these methods, and on this basis an improved kNN algorithm is put forward. Both mathematical reasoning and experimental results show that the algorithm significantly improves classification efficiency while keeping the same accuracy as the standard kNN classifier.
Keywords/Search Tags:text classification, k-nearest neighbor algorithm, area division, pre-classifier
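
The general idea in the abstract (partition the training set, then use the position of the test pattern relative to each part to skip distance calculations) can be illustrated with a minimal sketch. This is not the thesis's algorithm or either of its pre-classifier models. It assumes Euclidean distance on dense vectors, region centers supplied by the caller, and a triangle-inequality bound for pruning. All function names (`build_regions`, `knn_pruned`, `classify`) are hypothetical.

```python
import heapq
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_regions(points, labels, centers):
    """Assign each training point to its nearest center (assumed given)
    and record each region's radius: the largest center-to-member distance."""
    regions = [{"center": c, "radius": 0.0, "members": []} for c in centers]
    for p, y in zip(points, labels):
        region = min(regions, key=lambda r: dist(p, r["center"]))
        region["members"].append((p, y))
        region["radius"] = max(region["radius"], dist(p, region["center"]))
    return regions


def knn_pruned(query, regions, k):
    """Return the exact k nearest (distance, label) pairs and the number of
    point distances computed, skipping regions that cannot hold a neighbor."""
    # Triangle inequality: no member of a region is closer to the query
    # than dist(query, center) - radius.
    bounds = sorted(
        (max(0.0, dist(query, r["center"]) - r["radius"]), i)
        for i, r in enumerate(regions)
    )
    best = []  # max-heap through negated distances: (-distance, label)
    computed = 0
    for bound, i in bounds:
        if len(best) == k and bound > -best[0][0]:
            break  # every remaining region is farther than the k-th best
        for p, y in regions[i]["members"]:
            d = dist(query, p)
            computed += 1
            if len(best) < k:
                heapq.heappush(best, (-d, y))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, y))
    neighbors = sorted((-nd, y) for nd, y in best)
    return neighbors, computed


def classify(query, regions, k):
    """Majority vote over the k nearest neighbors."""
    neighbors, _ = knn_pruned(query, regions, k)
    return Counter(y for _, y in neighbors).most_common(1)[0][0]
```

Because a region is skipped only when its lower bound exceeds the current k-th best distance, the sketch returns the same neighbor distances as a brute-force scan, in line with the abstract's claim that accuracy is unchanged. How much work is saved depends on how well the regions separate the data.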