Research Of Approximate Nearest Neighbor Search Based On Locality Sensitive Hashing

Posted on:2015-07-18

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Liu

GTID:2298330431959856

Subject:Computer technology

Abstract/Summary:

Approximate Nearest Neighbor (ANN) search in high-dimensional space has become a fundamental paradigm in many applications, especially similarity search over multimedia data. Recently, Locality Sensitive Hashing (LSH) and its variants have been acknowledged as the most promising solutions to ANN search. However, state-of-the-art LSH approaches share a drawback: accessing candidate objects requires a large number of random I/O operations. To guarantee the quality of the returned results, a sufficient number of objects must be verified, which incurs an enormous I/O cost. To address this issue, we propose a novel method, SortingKeys-LSH (SK-LSH), which reduces the number of page accesses by arranging candidate objects locally. We first define a new measure of the distance between the compound hash keys of two points. A linear order on the set of compound hash keys is then created, and the corresponding data points are sorted accordingly. Hence, data points that are close to each other under this distance measure can be stored near one another in an index file. During ANN search, only a limited number of disk pages in a few index files need to be accessed to generate and verify sufficient candidates, which not only significantly reduces the response time but also improves the accuracy of the returned results. Our exhaustive empirical study over several real-world data sets demonstrates the superior efficiency and accuracy of SK-LSH for ANN search compared with state-of-the-art methods, including LSB and C2LSH.
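
The pipeline described in the abstract (compound hash keys, a distance between keys, a linear order, and page-local storage) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the actual distance measure or index layout, so everything here is an assumption for illustration. The hash functions are the common random-projection form h(v) = floor((a·v + b) / w), the linear order is plain lexicographic order on key tuples, `key_distance` is a hypothetical prefix-based measure, and a "page" is simply a fixed-size slice of the sorted list standing in for a disk page. All function and parameter names are hypothetical.

```python
import bisect
import math
import random


def make_hash_family(dim, m, w, seed=0):
    """Draw m random projections (a, b) for h(v) = floor((a.v + b) / w)."""
    rng = random.Random(seed)
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(m)]


def compound_key(point, family, w):
    """Concatenate the m hash values of a point into one compound key (a tuple)."""
    return tuple(
        math.floor((sum(a_i * x_i for a_i, x_i in zip(a, point)) + b) / w)
        for a, b in family
    )


def key_distance(k1, k2):
    """Assumed measure (not from the thesis): number of slots from the first
    differing position onward, then the gap at that position. Smaller tuples
    mean closer keys; identical keys give (0, 0)."""
    for i, (x, y) in enumerate(zip(k1, k2)):
        if x != y:
            return (len(k1) - i, abs(x - y))
    return (0, 0)


def build_index(points, family, w, page_size):
    """Sort point ids by compound key (lexicographic order is a linear order)
    and cut the sorted run into fixed-size pages."""
    entries = sorted((compound_key(p, family, w), i) for i, p in enumerate(points))
    keys = [k for k, _ in entries]
    pages = [entries[s:s + page_size] for s in range(0, len(entries), page_size)]
    return keys, pages


def search(query, points, family, w, keys, pages, page_size, max_pages=3):
    """Locate the query key in the sorted order, read at most max_pages
    neighboring pages, and verify candidates by true Euclidean distance.
    Assumes a non-empty index."""
    pos = bisect.bisect_left(keys, compound_key(query, family, w))
    center = min(pos // page_size, len(pages) - 1)
    lo = max(0, center - max_pages // 2)
    hi = min(len(pages), lo + max_pages)
    lo = max(0, hi - max_pages)  # keep the window full near the end of the file
    candidates = [i for page in pages[lo:hi] for _, i in page]
    best = min(candidates, key=lambda i: math.dist(points[i], query))
    return best, hi - lo
```

A possible usage: build the index once with `keys, pages = build_index(points, family, w, page_size)`, then call `search(query, points, family, w, keys, pages, page_size)` per query. The second return value is the number of pages read, which is the quantity the abstract aims to keep small. Because nearby keys sit on the same or adjacent pages, candidate generation touches a contiguous run of pages rather than scattered buckets.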

Keywords/Search Tags:

Approximate Nearest Neighbor Search, Linear Order Relationship, Locality Sensitive Hashing

Related items

1	Locality Sensitive Hashing Index Based On Neighborhood Collision Counting
2	Research On Locality Sensitive Hashing Based Approximate Nearest Neighbor(s) Searching Algorithm
3	Hash-based Approximate Nearest Neighbor Search For High-dimensional Data
4	Study On Approximate Nearest Neighbor Search Over Encrypted Data In Cloud Computing
5	Research On Local Sensitive Hashing And Approximate Nearest Neighbor Algorithm
6	Research Of Algorithm For Crowd Abnormal Behavior Detection Based On The Technology Of Approximate Nearest Neighbor Search
7	Approximate Nearest Neighbor Search For High-Dimensional Based On Nearest Neighbor Graph
8	Efficient Algorithms For Approximate Aggregation And Nearest Neighbor Queries Over Multi-Dimensional Data
9	Study On The Efficient Approximate Nearest Neighbor Search For Massive Data
10	Research On Approximate Nearest Neighbor Search And Maximum Inner Product Search For High-dimensional Dataset