Research On Distributed Spatial Connection Query Based On MapReduce

Posted on: 2014-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
Full Text: PDF
GTID: 2270330467488846
Subject: Surveying and Mapping project
Abstract/Summary:
In recent years, with the accelerating pace of information technology, geospatial information acquisition technology has been advancing rapidly. Meanwhile, geospatial data of ever-increasing scale has become an important source of massive data. The spatial join query is a common and very time-consuming complex spatial query operation, especially on large-scale spatial data sets. Because traditional stand-alone systems and MPI cluster systems both struggle to meet its time and space overhead, how to design an efficient distributed spatial join query algorithm in the cloud computing environment has become a research focus in academia and industry.

This paper makes a first attempt to put forward a distributed QR-tree index structure for the cloud computing environment and, on the basis of that index, designs a distributed spatial join query algorithm with MapReduce. The main work is as follows.

(1) A distributed QR-tree index structure that can support large-scale data sets in the cloud computing environment is proposed, and the process of building it is described in detail. Construction of the distributed QR-tree index is divided into two steps: first, the spatial data set is partitioned using a quadtree-based spatial data partition and stored in distributed fashion in HDFS data blocks; then, an R-tree index is built in parallel within the data block of each resulting sub-area.

(2) On the basis of the distributed QR-tree index, the distributed spatial join query algorithm QRSJ-MR is designed and implemented by combining the index structure with the distributed parallel computing framework MapReduce. In addition, to address the problem of concurrent access to the index within the algorithm, a real-time caching mechanism is used to optimize concurrent index access.

(3) A Hadoop cluster environment is built and the efficiency of the MapReduce-based distributed spatial join algorithm QRSJ-MR is tested. For both the spatial overlap join query and the spatial contain join query, performance comparison tests are run against a non-indexed spatial join algorithm with MapReduce and an R-tree-based spatial join algorithm with MapReduce.

The experimental results show that, compared with the non-indexed spatial join query algorithm with MapReduce and the R-tree-index-based spatial join query algorithm with MapReduce, the QRSJ-MR algorithm is more efficient for both the spatial overlap join query and the spatial contain join query.
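The partition-then-join idea summarized above can be illustrated with a small single-process sketch. It is not the thesis's QRSJ-MR implementation: the function names (`quadtree_cells`, `map_phase`, `reduce_phase`, `spatial_overlap_join`), the `capacity` and `max_depth` parameters, and the rectangle format `(xmin, ymin, xmax, ymax)` are all assumptions made for illustration. A plain nested loop stands in for the per-cell R-tree probe, and an in-memory dictionary stands in for the Hadoop shuffle. All rectangles are assumed to lie inside the given extent.

```python
from collections import defaultdict

def overlaps(a, b):
    """True if axis-aligned rectangles a and b (xmin, ymin, xmax, ymax) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def quadtree_cells(extent, rects, capacity=2, depth=0, max_depth=8):
    """Split extent into quadrants until a leaf overlaps at most `capacity` rects."""
    inside = [r for r in rects if overlaps(extent, r)]
    if len(inside) <= capacity or depth == max_depth:
        return [extent]
    xmin, ymin, xmax, ymax = extent
    xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
    quads = [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
             (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]
    cells = []
    for q in quads:
        cells.extend(quadtree_cells(q, inside, capacity, depth + 1, max_depth))
    return cells

def map_phase(cells, tagged_rects):
    """Emit (cell_index, (tag, id, rect)) for every leaf cell a rectangle overlaps."""
    for tag, rid, rect in tagged_rects:
        for i, cell in enumerate(cells):
            if overlaps(cell, rect):
                yield i, (tag, rid, rect)

def reduce_phase(grouped):
    """Join R and S rectangles cell by cell (nested loop as a stand-in for an R-tree)."""
    result = set()  # a set drops pairs found in more than one cell
    for records in grouped.values():
        left = [(rid, r) for tag, rid, r in records if tag == "R"]
        right = [(sid, s) for tag, sid, s in records if tag == "S"]
        for rid, r in left:
            for sid, s in right:
                if overlaps(r, s):
                    result.add((rid, sid))
    return result

def spatial_overlap_join(extent, R, S, capacity=2):
    """Overlap join of two {id: rect} data sets over a shared quadtree partition."""
    cells = quadtree_cells(extent, list(R.values()) + list(S.values()), capacity)
    tagged = [("R", k, v) for k, v in R.items()] + [("S", k, v) for k, v in S.items()]
    grouped = defaultdict(list)
    for key, value in map_phase(cells, tagged):  # simulated shuffle: group by cell
        grouped[key].append(value)
    return reduce_phase(grouped)
```

A rectangle that straddles a cell boundary is replicated into every cell it overlaps, which is why the reduce step collects pairs in a set. Any two overlapping rectangles share at least one leaf cell, so no result pair is lost by joining each cell independently.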
Keywords/Search Tags: HDFS, MapReduce, Spatial join query, QR-tree