
Research And Implementation Of Multi-Way Join Query Processing Algorithms Over Big Spatial Data In Cloud Environment

Posted on: 2016-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: Q J Wang
GTID: 2428330542989578
Subject: Computer application technology
Abstract/Summary:
With the development of information networks, represented by the Internet and mobile computing, spatial data resources are growing and accumulating explosively. How to use cloud computing technology to efficiently process multi-way spatial join queries over this kind of data has become one of the current hot research issues in the field of spatial data management. The traditional centralized approach cannot keep up with the rapid growth of big spatial data. Existing distributed multi-way spatial join query algorithms are mainly based on MapReduce, such as the Controlled-Replicate algorithm. To some extent it improves the efficiency of multi-way spatial join queries, but problems such as weak filtering and excessive replication remain, and they incur extra CPU and I/O costs. To address these problems, this thesis builds on Spark, currently the most popular cloud computing and big data processing platform, to study multi-way join query processing algorithms over big spatial data in depth. The main work is as follows:

1. This thesis puts forward BSMW-1, a multi-way spatial join query algorithm based on grid filtering. To address the excessive replication of the Controlled-Replicate algorithm, BSMW-1 filters out useless join objects by grid filtering and reduces redundant replication of join objects by narrowing the scope of replication. As a result, it improves the performance of multi-way join query processing in the cloud environment.

2. This thesis also proposes BSMW-2, an iterative multi-way spatial join query algorithm. BSMW-2 processes the query iteratively: each step computes the join of only two data sets, and then computes the join on the intermediate result. It also adopts grid partitioning and encoding to preprocess the spatial data. This optimization avoids loading a large amount of data into memory at one time, which speeds up the query and improves the performance of the algorithm.

3. Finally, through a large number of experiments on synthetic data sets, we analyzed and tested the two multi-way spatial join query algorithms in the cloud environment. The experimental results show that the two proposed multi-way join algorithms perform better than the Controlled-Replicate algorithm and have good adaptability.
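The abstract gives no pseudocode for BSMW-1 or BSMW-2, so the following is only a minimal single-machine sketch of the two general ideas it names: using a uniform grid so that only objects sharing a cell are compared, and evaluating a multi-way join iteratively, two inputs at a time. It is not the thesis's implementation. The function names, the rectangle representation `(xmin, ymin, xmax, ymax)`, the `cell_size` parameter, and the restriction to a chain query (each data set is joined with the next one by overlap) are all assumptions made for illustration.

```python
from collections import defaultdict


def intersects(a, b):
    """True if rectangles a and b (xmin, ymin, xmax, ymax) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def cells_of(rect, cell_size):
    """All grid cells (column, row) that a rectangle covers."""
    x0, x1 = int(rect[0] // cell_size), int(rect[2] // cell_size)
    y0, y1 = int(rect[1] // cell_size), int(rect[3] // cell_size)
    return [(i, j) for i in range(x0, x1 + 1) for j in range(y0, y1 + 1)]


def pairwise_join(partials, dataset, cell_size):
    """Extend each partial result by every rectangle in `dataset` that
    overlaps the last rectangle of the partial result.

    Assumes rectangles within one data set are distinct, because the
    set below is used to drop pairs found in more than one cell.
    """
    # Index the new data set by grid cell.
    grid = defaultdict(list)
    for rect in dataset:
        for cell in cells_of(rect, cell_size):
            grid[cell].append(rect)

    results = set()
    for partial in partials:
        last = partial[-1]
        # Grid filter: only candidates sharing a cell are tested exactly.
        for cell in cells_of(last, cell_size):
            for rect in grid.get(cell, ()):
                if intersects(last, rect):
                    results.add(partial + (rect,))
    return sorted(results)


def multiway_join(datasets, cell_size):
    """Chain join R1, R2, ..., Rn evaluated two inputs at a time."""
    partials = [(rect,) for rect in datasets[0]]
    for dataset in datasets[1:]:
        # Each iteration joins the intermediate result with one data set.
        partials = pairwise_join(partials, dataset, cell_size)
    return partials
```

Calling `multiway_join([r1, r2, r3], cell_size)` returns tuples with one rectangle from each input. In a Spark setting the cell identifier would typically serve as the key on which objects are grouped across workers, but that distribution step is not shown here.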
Keywords/Search Tags:Cloud Computing, Spark, Multi-way Spatial Join Query, Grid Filtering