Font Size: a A A

The Research Of A MapReduce-Based KNN Algorithm In Spatial Database

Posted on:2013-04-25Degree:MasterType:Thesis
Country:ChinaCandidate:B LiuFull Text:PDF
GTID:2248330371970761Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
In recent years, with the development of LBS(Location Based Services) and mobile internet, the amount of geospatial data is rapidly growing. The increasing volume of spatial data, however, poses novel problems to traditional indexing mechanisms which usually assume an in-memory index or optimize for local disk access. How to accomplish efficient index and query processing on large scale spatial data becomes new requirements and challenges towards cloud environments. A scalable and distributed spatial data index is a best choice for the effective processing of the spatial analysis and query. There are several approaches that implement distributed indices and query processing with MapReduce, such as R-tree and Voronoi-based index. However, R-tree is unsuitable for parallelization and query processing on Voronoi-based index needs extra computation for localization or local index reconstruction. This paper researches the problem of the spatial data index and approaches of the spatial analysis and query, and then puts forward an improved approache.This paper is the first detailed attempt in designing a distributed inverted spatial grid index in the cloud computing context and using it to process spatial kNN queries with MapReduce. The main contributions in this paper are as follows:(1) This paper proposed a distributed Inverted Grid Index for given data points in 2D space, which meets the standard of spatial index:dynamic and simple. Because of its loose coupling and shared nothing architecture, the inverted grid index is more suitable for large-scale parallel spatial query application with MapReduce. (2) This paper puts forward a approach of constructing the inverted grid index with MapReduce and propose a novel kNN query algorithm MRCircleTrip, which is based on our index structure. Besides, this paper also gives the mathematical proof of the convergence of the kNN query algorithm to ensure the correctness of the algorithm. (3) This paper also did a lot of experiments about constructing index and kNN query on Inverted Grid Index to verify the scalability of the proposed index structure and kNN query performance. The results show that the constructing time of the index proposed in this paper is obvious less than R-tree and Voronoi-based index and the scalability also outperforms the other two kinds of index. The kNN query algorithm in this paper is at least three times faster than Voronoi-based query processing.
Keywords/Search Tags:Spatial index, kNN query, Grid index, MapReduce
PDF Full Text Request
Related items