Performance Optimization For Multi-Table Joins In Main Memor Databases

Posted on:2020-12-14

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z H Fang

Full Text:PDF

GTID:1368330596967787

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

Database development is driven by application requirements and hardware updates.In the current era of big data,real-time analysis for huge amounts of data is a typical application.Such application cannot be met by systems that mainly use the disk as the main storage medium,such as traditional disk databases and Hadoop system,because their analysis performance is dominated by expensive disk I/O.On the other hand,large main memory clusters provide the possibility to conduct real-time analysis on huge amounts of data,no matter considering the price and potential performance improvements.So improving the real-time analysis in main memory databases is an interesting issue.In databases,data analysis is manipulated through SQL query language.Among queries,the j oin operator is a characteristic of the database and is widely used.However,this operator is very time-consuming,especially for joining multiple tables.Therefore,it has drawn much attention in databases.To this end,this paper focuses on exploring ways to accelerate the performance of multi-table joins in main memory databases.Specifically,a multi-join query consists of several pipelines.Its performance can be accelerated from the aspects of inter-pipeline parallelism and intra-pipeline parallelism.The main work and contributions of this paper are as follows.(1)A real-time task scheduler.From the point of view of the independent parallelism,parallelizing multiple pipelines provides the possibility to speed up the execution of multi-table j oin queries.However,during the execution of multiple pipelines in the main memory database,hardware resources are seriously wasted and the execution order of pipelines is suboptimal.To solve these two problems,we propose two dy-namic scheduling technologies:adaptive filling and rank-based preemption.These two techniques guide the implementation of a real-time task scheduler for main mem-ory databases.The scheduler consists of two layers.The global layer arranges the execution order of pipelines and schedules some extra pipelines to fill idle resources.The local layer coordinates resource allocation among multiple pipelines to increase local resource utilization.These two levels work together to avoid wasting resources to shorten the execution time of multiple pipelines in the entire query.(2)A vectorized multi-table probe.Among the multiple pipelines generated from multi-table j oin queries,the one probing multiple tables is very time-consuming,up to 90%of the whole join process.On the other hand,wider SIMD vectors provide larger data parallelism to speed up complex operations.Therefore,this paper explores the coding method and implementation of the SIMD vectorized multi-table probe.This paper summaries the existing coding methods in vectorization.In order to eliminate the scalar tail,the whole-procedure vectorization,a coding method,is proposed to fully use the vectorization technology.Based on this new coding method,this pa-per discusses how to use SIMD instruction to speed up the multi-table probe,then designs and implements the vertical and horizontal vectorized multi-table probe op-erators.The vectorized multi-table probe is more efficient and can greatly reduce the execution time of multi-table j oins.(3)Interleaved Multi-Vectorizing.Each pipeline in the multi-table join query fre-quently accesses hash tables thus faces the performance bottleneck of memory ac-cesses.This problem is even more serious in vectorization because if a cache miss occurs from one of the data elements contained in a vector,the execution of data elements in the entire vector is stalled.Therefore,this paper explores how to apply data prefetching to vectorize joins to reduce cache misses,forming the interleaved multi-vectorizing technique.This technique executes multiple instances of vector-ized codes interleavingly,and when a memory access occurs in one instance of the execution,it simply issues data prefetching instructions and then switches to other execution instances,expecting to overlap the data accesses with the operations from the multiple execution instances.In addition,to solve the control flow divergence in vectorization,this paper proposes the residual vector state to integrate with the diver-gent vector state,eliminating the bubbles in the vector.This technology can break through the memory access bottleneck in vectorization and release the advantages of vectorization,so it can significantly improve the performance of vectorized j oins.In conclusion,this paper speeds up the multi-table j oins in main memory databases from the perspective of task scheduling,vectorization and memory accesses.Through multiple experiments on the standard benchmarks,this paper proves the correctness and effectiveness of the proposed methods.This paper lays a foundation for implementing a whole-process vectorized execution engine,in order to avoid performance bottlenecks such as cache misses,branch misprediction and calculation overhead in the existing exe-cution engines,releasing the advantages of main memory databases.

Keywords/Search Tags:

Memory Databases, Multi-Table Joins, Task Scheduling, Vectorization, Memory Accesses

PDF Full Text Request

Related items

1	Static analysis of memory accesses in multithreaded and recursive programs
2	Design And Optimizations Of Multi-table Join Operations For NVM-based In-memory Databases
3	Research On Task Scheduling And Memory Management Based On Multi-Core Processor
4	The Research On Spark Task Scheduling Strategy Based On Dynamic Memory Awareness
5	Research And Implementation Of Task Scheduling Mechanism Based On Embedded System
6	Scheduling Research For Memory Controller Based On Density Of Memory Access
7	Data-aware Task Scheduling On Heterogeneous Hybrid Memory Multiprocessor Systems
8	Research On Key Technologies Of Large-scale Graph Processing
9	Design And Implementation Of Parallel Multi-access Memory Interface
10	Research Of In-Memory MapReduce System For Memory Efficiency Optimization