Clustering Algorithm For Index Structure In KNN Search

Posted on:2020-06-22

Degree:Master

Type:Thesis

Country:China

Candidate:K S Guan

Full Text:PDF

GTID:2428330596495060

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

With the continuous development of information technology,various kinds of data information continue to enrich people's spiritual and material life.And people are paying more and more attention to extracting effective information using algorithms in data mining.k nearest neighbor algorithm(kNN algorithm)is one of the classical algorithms in the field of data mining,and it has been widely used in the fields of scientific and engineering as an efficient non-parametric technique.Recently,reducing the amount of computation in the k nearest neighbor algorithm has become a hotspot of research topics.Because of that,the algorithm has to traverse each object of the data set,which leads to the algorithm consumes a lot of computing resources when processing the data.In order to solve the above problems,many research works recently focus on the preprocessing of data,that is,constructing the index structure for the data sets before the kNN query.The research works aim to find the k nearest neighbors of the query object by calculating a part of the data set.This paper proposes a new index structure(i.e.SCA algorithm)for kNN query for spatial network query.The SCA algorithm divides the road segment into multiple road segments according to the position of the fixed route,and models the road segment into a weighted directed graph.Then the SCA algorithm clusters the road segments into multiple sub-road segments according to the speed of the moving object.The SCA algorithm can quickly update objects and reduce the consumption of data updates.This paper proposes a kNN scan(SCAkNN)algorithm based on the SCA clustering algorithm.The SCAkNN algorithm can quickly determine the region containing k nearest neighbors,and determine candidate sub-segments according to the range of these regions.Then the algorithm marks the objects in the candidate sub-segments as candidate objects.Finally the algorithm finds the k nearest neighbors of the query object from the candidate objects.In the experiments,we select the vehicle driving data of three different roads to verify the effectiveness of the proposed algorithm.And the experimental results show that,the proposed algorithm can effectively reduce the object clipping rate,reduce the computational cost of the object,and avoid the uncertainty of the algorithm iteration number.In the free space query,this paper proposes a width-weighted clustering algorithm based on the number of objects(NOWC algorithm).The algorithm first uses the FWC algorithm to divide the data set into clusters with the same radius.Then the algorithm calculates the cluster width weight,according to the number of objects and the harmonic coefficient in each cluster.Finally,the algorithm iteratively divides the cluster until the number of objects in the cluster is not greater than the threshold.The algorit hm adjusts the weight of the cluster by the number of objects and the harmonic coefficient in the cluster,to balance the size of the cluster.Besides,this paper proposes a k nearest neighbor search algorithm(NOWCkNN algorithm)based on the NOWC algorithm.Finally,we use 8 public datasets in UCI database to verify the proposed method.Experimental results show that,compared with the existing methods,the proposed algorithm can reduce the number of cluster iterations,reduce modeling time,and increase the object pruning rate in the kNN query,especially for data sets with high dimensions and large data volume.Meanwhile,the algorithm is universal,so that it can achieve better query results in different data sets with different dimensions.

Keywords/Search Tags:

Clustering, Index structure, k-nearest neighbor, Road networks

PDF Full Text Request

Related items

1	Research On Methods For Mnn Query And Maxbrnn Query In Road Networks
2	K-Nearset Neighbor L-Close Friends Search On Road-Social Networks
3	Study On K Nearest Neighbor Queries Of Moving Objects Based On Road Networks
4	Research On Processing Of Nearest Neighbor And Reverse Nearest Neighbor Queries Over Moving Objects In Road Networks
5	Study On K Nearest Neighbor Query Based On Road Networks In Mobile Database
6	Research Of The Efficient Index Structure For Approximate Nearest Neighbor Algorithm
7	Research And Implementation Of Continuous Nearest Neighbor Query Processing Technology In Time-dependent Road Networks
8	Locality Sensitive Hashing Index Based On Neighborhood Collision Counting
9	Research On Nearest Neighbor Query Of Moving Objects With Uncertain Velocity In Road Network
10	Research On Clustering Algorithm Based On The Nearest Neighbors