Font Size: a A A

Research On Adaptive Graph-based Nearest Neighbor Search Algorithm

Posted on:2022-12-12Degree:MasterType:Thesis
Country:ChinaCandidate:Z Z WangFull Text:PDF
GTID:2518306779464174Subject:Enterprise Economy
Abstract/Summary:PDF Full Text Request
Approximate nearest neighbor search is a basic topic in information retrieval technology and is widely applied to databases,recommender system and other fields.Compared with exact nearest neighbor search algorithm,approximate nearest neighbor search algorithm has a smaller memory usage and sacrifices a smaller query recall to achieve extremely fast query speed.Graph-based approximate nearest neighbor search algorithm is one of the most commonly used approximate nearest neighbor search algorithms.Compared with algorithms based on space partition,hashing,and quantization,it has been widely used by major commercial companies due to its fast query speed and high query recall.The goal of graph-based search algorithms is to build a high-quality graph-based index structure to achieve fast and accurate search.In recent years,scholars have been committed to building high-quality graph-based index structures and using index structures to improve query efficiency.However,all query points use fixed query termination parameters when querying,which will cause some query points to generate extra query time and make the query inefficient.In response to the above problems,this paper proposes the learning adaptive early termination search algorithm and the early termination search algorithm based on offline K-Means training on the index structure of the Link and Code algorithm.The common goal of the two algorithms is to use the trained model to predict different query termination parameters for different query points and perform early termination of the query,so that the query efficiency is effectively improved.The two algorithms are as follows:(1)The learning adaptive early termination search algorithm needs to use the intermediate results generated during the query and the query vector as features,and then selects the Gradient Boosting Decision Tree(GBDT)model for feature training to obtain the final prediction model.Finally,it is integrated into the graph-based index structure for query and prediction;(2)The early termination search algorithm based on offline K-Means training,first performs K-Means clustering on the training set to obtain the cluster centers,and uses the ratios of the distances between the query point and the different cluster centers as features.Second,the neural network is selected for feature training to obtain the prediction model.The prediction model is used for offline regression prediction to obtain query termination parameters before the start of the query and finally the query is performed on the graph-based index structure.Both algorithms finally integrate the prediction model into the graph-based index structure to predict and terminate the query operation early to achieve a more efficient query.In this paper,experiments have been conducted on six kinds of datasets such as SIFT and DEEP,and the two algorithms have been fully verified respectively.All experimental results show that the learning adaptive early termination search algorithm and the early termination search algorithm based on offline K-Means training have reduced the average query time by 76% and 77%,which is improved by 4.1 times and 4.3 times,respectively.And compared with the benchmark algorithm,the two algorithms have achieved a large improvement.Finally,this paper compares the features and models of the two algorithms,and combines with the analysis of experimental results on datasets,and finally verifies that the early termination search algorithm based on offline K-Means training has more advantages.
Keywords/Search Tags:approximate nearest neighbor search, graph-based index structure, adaptive search processing
PDF Full Text Request
Related items