
Column-stores Join Optimization On Coupled CPU-GPU Architecture

Posted on: 2017-03-16  Degree: Master  Type: Thesis
Country: China  Candidate: Z T Li  Full Text: PDF
GTID: 2308330503453764  Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the Internet era, enterprise data has grown explosively, and how to better support the storage and analysis of massive data has become an important concern for enterprises. The data warehouse is an important software tool for storing and analyzing massive data. Compared with a traditional relational database, which focuses on transactions, a data warehouse pays more attention to querying and analyzing massive data. Compared with row storage, column storage is more widely used in data warehouses because of its advantages in read-mostly environments. Because a column-store system supports independent storage, compression, and processing of each column, it is better suited to optimizing read operations.

Integrated multicore and heterogeneous CPU-GPU architectures have become the trend in processor chip development, and central processors based on such architectures are increasingly widely used in commercial computers. Given this trend, it is valuable to study how to improve the query performance of data warehouse software on integrated multicore and heterogeneous CPU-GPU architectures.

This thesis mainly studies optimization techniques for column-store join primitives on the coupled CPU-GPU architecture. Its main contributions are as follows.

First, we study parallel join algorithms for column-store systems, and design and implement OpenCL-based join algorithms for the column-store system. In addition, we study GPU-based data partitioning. To reduce its space complexity, we propose a partition algorithm based on statistics of the number of data partitions.

Second, we study the use of the GPU as a coprocessor to accelerate data query operations.
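As a rough illustration of count-based partitioning, the sketch below uses a generic two-pass scheme: count the keys in each partition, turn the counts into start offsets with an exclusive prefix sum, then scatter the keys into one output buffer of exactly the input size. Reading "statistics" as per-partition counts is an assumption, and this is not the thesis's actual algorithm. The function name `partition_by_count` and the modulo hash are hypothetical, and the code is sequential Python rather than an OpenCL kernel.

```python
def partition_by_count(keys, num_partitions):
    """Two-pass partitioning: count, exclusive prefix sum, scatter.

    Hypothetical helper for illustration; keys are non-negative integers
    and the partition of a key is simply key % num_partitions.
    """
    # Pass 1: count how many keys fall into each partition.
    counts = [0] * num_partitions
    for k in keys:
        counts[k % num_partitions] += 1

    # Exclusive prefix sum: offsets[p] is where partition p starts in the output.
    offsets = [0] * num_partitions
    running = 0
    for p in range(num_partitions):
        offsets[p] = running
        running += counts[p]

    # Pass 2: scatter every key into a single buffer sized exactly len(keys).
    out = [None] * len(keys)
    cursor = list(offsets)  # next free slot of each partition
    for k in keys:
        p = k % num_partitions
        out[cursor[p]] = k
        cursor[p] += 1

    return out, offsets, counts


if __name__ == "__main__":
    out, offsets, counts = partition_by_count([5, 2, 8, 3, 6, 1], 2)
    print(out, offsets, counts)
```

Because the output buffer is sized from the counts, this scheme needs no over-allocated buffer per partition, which is the usual motivation for counting before scattering.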
After that, a new pipelined co-processing mechanism based on the integrated CPU-GPU architecture is studied in depth, and a dynamic data allocation strategy is proposed to address the problem that the data allocation ratio cannot be changed dynamically.

Third, we study and improve the column-store system developed by our laboratory. We implement the data distribution strategy in the system to optimize the join operation by taking full advantage of the computing resources of the integrated CPU-GPU processor.

Finally, the validity of the proposed method is verified using SSB (the Star Schema Benchmark). Experimental results show that the proposed method improves the efficiency of two-table joins by up to 33.2%, reduces the execution time of the standard SSB query Q1.1 by 9.81%, and reduces the execution time of Q3.1 by 7.03%, each compared with the execution time of the same query without our method.
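The abstract does not state the rule used by the dynamic data allocation strategy. One simple possibility, shown here purely as an assumption, is to process the input in chunks and re-split each new chunk in proportion to the throughput each device achieved on the previous one. The names `next_gpu_ratio` and `split_chunk`, the `fallback` parameter, and the proportional rule itself are all hypothetical.

```python
def next_gpu_ratio(gpu_rows, gpu_seconds, cpu_rows, cpu_seconds, fallback=0.5):
    """Estimate the GPU share of the next chunk from measured throughput.

    Assumed rule (not taken from the thesis): share work in proportion to
    the rows per second each device achieved on the previous chunk.
    """
    # Without a usable measurement for both devices, keep the fallback ratio.
    if min(gpu_rows, cpu_rows) <= 0 or min(gpu_seconds, cpu_seconds) <= 0:
        return fallback
    gpu_rate = gpu_rows / gpu_seconds  # rows per second on the GPU
    cpu_rate = cpu_rows / cpu_seconds  # rows per second on the CPU
    # A proportional share aims for both devices to finish at about the same time.
    return gpu_rate / (gpu_rate + cpu_rate)


def split_chunk(chunk_rows, gpu_ratio):
    """Split a chunk into (rows for the GPU, rows for the CPU)."""
    gpu_rows = int(chunk_rows * gpu_ratio)
    return gpu_rows, chunk_rows - gpu_rows


if __name__ == "__main__":
    # Previous chunk: GPU handled 600 rows in 2.0 s, CPU handled 400 rows in 4.0 s.
    ratio = next_gpu_ratio(600, 2.0, 400, 4.0)
    print(ratio, split_chunk(1000, ratio))
```

On a coupled CPU-GPU chip both devices share memory, so changing the split between chunks does not by itself require copying data over a bus; a real system would also need to smooth noisy timings, which this sketch omits.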
Keywords/Search Tags: heterogeneous chip, data pre-fetching, sort-merge join, query optimization, OpenCL