Font Size: a A A

Fast Approximate K Nearest Neighbours Search Based On Set Compression Tree And Best Bin First

Posted on:2018-07-18Degree:MasterType:Thesis
Country:ChinaCandidate:Z J ChenFull Text:PDF
GTID:2428330596489109Subject:Control Engineering
Abstract/Summary:PDF Full Text Request
In the fields of machine learning,computer vision,image processing,k-nearest neighbors search has been widely used.It is a basic foundation of many algorithms and it is also the most time consuming part.With the development of science and technology,a large number of high dimensional data generat.Scientific research workers more and more focus attention on how to find an approximate k-nearest neighbors in the high dimensional space quickly.The effects of many traditional k nearest neighobors methods are not very useful due to the “curse of dimensionality”.This paper briefly introduces the present situation and research direction of fast near-neighbor search at home and abroad.Secondly,this paper expounds the theory of k-nearest neighbors,and introduces the k-nearest neighbors search based on tree structure represented by KD tree and Hierarchical k-means tree(HKM).Then,we focus on the Set Compression Tree(SCT)algorithm and Best Bin First(BBF)algorithm,and compare their advantages and disadvantages.In this paper,a new algorithm named SCT-BBF is proposed,which combines the advantages of SCT and BBF.Finally,we compare all the performance of the SCT-BBF,KD tree and HKM tree algorithms on the MNIST dataset and the SIFT1 M dataset respectively.The innovation of this paper is as follows: SCT-BBF uses PCA to reduce dimensions to low dimensional space.Relevant SCT parameters are generated by the training set,so the speed of tree construction is fast.BBF algorithm can be used to find approximate k nearest neighbors in a very short time.Finally,we return to the high dimensional space again to optimize the search result and improve the accuracy of the approximate k nearest neighbors.SCT-BBF can overcome the "curse of dimensionality" brought by high dimensional data to a certain degree.Futhermore,it has been proved an effective and feasible algorithm with high overall speed and high accuracy.
Keywords/Search Tags:Fast approximate nearest neighbors, Set Compression Tree, Best Bin First, high dimensional space
PDF Full Text Request
Related items