
Research On Nearest Neighbor Method Based On Sparse Representation And Decision Tree

Posted on: 2017-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: M Zong
GTID: 2308330488475459
Subject: Computer software and theory

Abstract/Summary:
Classification is one of the most basic and important problems in data mining, with a wide range of applications in computer vision, natural language processing, biometrics, medical diagnosis, and other fields. The nearest neighbor algorithm is a common classification method: given a test sample, it identifies similar training samples and uses them to classify that test sample. k Nearest Neighbor (kNN) searches the sample space for the k training samples nearest to an unknown sample and classifies it accordingly. As a classical classifier, kNN is one of the most widely used classification algorithms in data mining and is applied in face recognition, handwritten character recognition, and other tasks. However, kNN also has shortcomings, so improving it remains an active research topic. How to determine the value of k is an open issue, and this thesis focuses on that question. The work of this thesis is as follows:

1) It is unrealistic for kNN to use a fixed value of k for all test samples, i.e., to classify every test sample with the same number of training samples. We introduce sparse representation into kNN and propose a novel algorithm, CM-kNN, which solves the fixed-k problem by sparse reconstruction: different test samples are classified with different numbers of training samples drawn from the training sample space, which better matches reality. (A minimal sketch of per-sample-k classification appears after this list.)

2) Compared with the classical kNN algorithm, CM-kNN improves performance considerably, but on large datasets the reconstruction process is still very time-consuming. To address this, we introduce a decision tree and propose the kTree algorithm. The decision tree avoids the reconstruction process at test time, greatly speeding up classification while maintaining accuracy similar to CM-kNN. (A hedged sketch of this idea also follows the list.)

3) Building on kTree, we consider the k nearest neighbor samples stored at a leaf node of the decision tree together with the nearest neighbors of those k nearest neighbors. We propose the k*Tree algorithm, which performs kNN classification within this small region, further reducing computation time while keeping accuracy similar to CM-kNN.
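The following is a minimal sketch, assuming NumPy, of kNN classification in which each test sample gets its own neighborhood size k, illustrating the per-sample-k idea from point 1). The thesis's CM-kNN learns each k through sparse reconstruction, which is not implemented here; the function and variable names are hypothetical.

```python
# Minimal per-sample-k nearest neighbor classification.
# The per-sample k values are given as input here; CM-kNN would learn them.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, X_test, k_per_sample):
    """Classify each test sample with its own neighborhood size k."""
    predictions = []
    for x, k in zip(X_test, k_per_sample):
        # Euclidean distances from the test sample to all training samples
        distances = np.linalg.norm(X_train - x, axis=1)
        # Indices of the k nearest training samples
        nearest = np.argsort(distances)[:k]
        # Majority vote among the k nearest labels
        label = Counter(y_train[nearest]).most_common(1)[0][0]
        predictions.append(label)
    return np.array(predictions)

# Toy usage with hypothetical data
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.05, 0.1], [0.95, 1.05]])
print(knn_predict(X_train, y_train, X_test, k_per_sample=[3, 1]))
```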
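Below is a hedged sketch of the kTree idea from point 2), assuming scikit-learn's DecisionTreeRegressor: a tree is trained offline to map a sample's features to a suitable k, so the costly reconstruction step is skipped at test time. The best_k_per_sample heuristic (a simple leave-one-out vote over candidate k values) is only an illustrative stand-in for the thesis's actual learning procedure.

```python
# Sketch: train a decision tree offline to predict k, then use it at test time.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def best_k_per_sample(X, y, candidate_ks=(1, 3, 5, 7)):
    """Pick, for each training sample, the smallest candidate k that would
    classify it correctly in a leave-one-out vote (illustrative heuristic)."""
    best = []
    for i, x in enumerate(X):
        d = np.linalg.norm(X - x, axis=1)
        d[i] = np.inf                      # exclude the sample itself
        order = np.argsort(d)
        chosen = candidate_ks[0]
        for k in candidate_ks:
            votes = np.bincount(y[order[:k]])
            if votes.argmax() == y[i]:
                chosen = k
                break
        best.append(chosen)
    return np.array(best)

# Offline step: learn a tree that maps features to a suitable k
X_train = np.random.rand(100, 2)
y_train = (X_train[:, 0] + X_train[:, 1] > 1).astype(int)
k_targets = best_k_per_sample(X_train, y_train)
k_tree = DecisionTreeRegressor(max_depth=4).fit(X_train, k_targets)

# Test time: one tree lookup gives k, then an ordinary kNN vote follows
x_test = np.array([[0.4, 0.7]])
k = max(1, int(round(k_tree.predict(x_test)[0])))
```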
Keywords/Search Tags: Nearest neighbor, k nearest neighbor, sparse representation, reconstruction, decision tree