
Hadoop Based Efficient Join Algorithm Research On GPU

Posted on: 2017-09-29  Degree: Master  Type: Thesis
Country: China  Candidate: J N Li  Full Text: PDF
GTID: 2348330503487177  Subject: Computer Science and Technology
Abstract/Summary:
In today's commercial database systems, data storage and query processing face severe problems because of ever-increasing data volumes. More and more research focuses on how to execute user queries efficiently on big data sets. As CPU development slows down, software-level optimization is approaching its limit, and GPU query processing has attracted more and more attention. With its strong computational capability and high parallelism, the GPU specializes in computation-intensive tasks. The join is one of the most important operations in commercial database systems, and many researchers focus on accelerating join operations on the GPU.

Existing methods of accelerating the Nested Loop Join on the GPU cannot deal with big data sets. In this paper, we focus on how to execute the Nested Loop Join efficiently on big data sets on the GPU. We integrate Hadoop with the GPU and implement Nested Loop Join, Hash Join and theta join. In our method, only joinable tuples are transmitted to the GPU for the actual join operation.

This paper is the first to accelerate theta join on the GPU in combination with Map-Reduce. After data pre-filtration, our method can deal with bigger data sets than existing approaches to accelerating equi join on the GPU. Our approach can also estimate the size of the join result more precisely without extra cost and allocate appropriate space for the result.

Experiments show that our method achieves a 0.5x to 1x speedup over traditional GPU-based equi join processing methods. We also run experiments on synthetic data sets, where our GPU method achieves a 0.3x to 1x speedup over the CPU method.
Keywords/Search Tags: GPU, Hadoop, equi join, theta join
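
The abstract does not give the details of the pre-filtration step, so the following is only a minimal CPU-side sketch of the general idea, written in Python under stated assumptions: the theta predicate is taken to be r < s on single integer keys, and the function names prefilter_less_than and nested_loop_join are hypothetical, not taken from the thesis. It shows how tuples that cannot match any partner can be dropped before the nested loop runs, and how the filtered input sizes give an upper bound on the result size.

```python
from typing import Callable, List, Tuple


def prefilter_less_than(r_keys: List[int], s_keys: List[int]) -> Tuple[List[int], List[int]]:
    """Drop keys that cannot satisfy r < s with any partner (assumed predicate)."""
    if not r_keys or not s_keys:
        return [], []
    s_max = max(s_keys)  # an r can only join if it is below the largest s
    r_min = min(r_keys)  # an s can only join if it is above the smallest r
    r_kept = [r for r in r_keys if r < s_max]
    s_kept = [s for s in s_keys if s > r_min]
    return r_kept, s_kept


def nested_loop_join(
    r_keys: List[int],
    s_keys: List[int],
    theta: Callable[[int, int], bool],
) -> List[Tuple[int, int]]:
    """Plain nested loop join: test every (r, s) pair against the predicate."""
    out = []
    for r in r_keys:
        for s in s_keys:
            if theta(r, s):
                out.append((r, s))
    return out


if __name__ == "__main__":
    R = [1, 5, 9, 12]
    S = [3, 4, 10]
    r_kept, s_kept = prefilter_less_than(R, S)
    # Upper bound on the output size after filtering (illustrative only).
    bound = len(r_kept) * len(s_kept)
    pairs = nested_loop_join(r_kept, s_kept, lambda r, s: r < s)
    print(r_kept, s_kept, bound, pairs)
```

In this sketch the filter is lossless for the assumed predicate: any pair with r < s has r below the maximum of S and s above the minimum of R, so both sides survive. In a Hadoop plus GPU setting, the surviving tuples would be the ones shipped to the device, but how the thesis partitions and transfers data is not stated in the abstract.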