Research On Parallel Query For In-memory Database

Posted on:2016-12-28

Degree:Master

Type:Thesis

Country:China

Candidate:C Li

Full Text:PDF

GTID:2308330470457724

Subject:Computer system architecture

Abstract/Summary:


With the advent of large-capacity memory and falling memory prices, in-memory databases have become widely used. Database performance has improved, but new challenges have appeared. Memory speed has grown much more slowly than processor speed, so memory access has become one of the bottlenecks for database queries. The advent of multi-core processors makes this worse. Meanwhile, as the volume of in-memory data grows, errors and imbalanced data access become more likely, so query algorithms need to be more flexible and fault-tolerant. Access conflicts that arise when multiple threads access the shared cache simultaneously degrade database performance. In addition, limited memory bandwidth and load imbalance between cores can reduce the efficiency of multithreaded execution. Many technical problems therefore remain to be solved in fully exploiting shared-cache processors and reducing access conflicts to improve database performance.

Query execution optimization is a challenging research topic that has long attracted wide attention. This thesis studies join algorithms to improve query execution performance on shared-cache systems. The main contributions of this thesis can be summarized as follows:

A strategy based on data partitioning is proposed for multi-threaded parallel join algorithms. For memory-constrained servers, we propose two join algorithms, based on radix join and sort-merge join, and optimize them for multi-core systems with a shared cache. For the partitioning phase, this thesis proposes an adaptive multithreaded partitioning strategy. For the aggregation phase, it proposes an execution strategy that is based on size classification and optimizes memory access. These optimizations significantly reduce load imbalance and conflicts between processor cores and improve the execution efficiency of the threads.

We design and implement a flexible join algorithm based on the MapReduce framework. Compared with traditional algorithms, it is more flexible and more fault-tolerant. It consists of Map and Reduce stages and adapts to the radix join. We also dissect the internal phases of a typical in-memory join algorithm and propose an effective hash join algorithm for multi-core architectures using the MapReduce framework. To reduce the cost of tagging and sorting, the new algorithm packs data for tagging. Our experimental results show that our algorithm clearly outperforms existing join algorithms for shared-memory architectures.
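
The abstract does not give the algorithms themselves, so the following is only a minimal, single-threaded sketch of the general radix-partitioned hash join idea that the thesis builds on, not the author's implementation. The function names (`radix_partition`, `hash_join_partition`, `radix_join`), the `(key, payload)` row layout, and the `bits` parameter are all assumptions made for illustration. Both inputs are split by the low bits of the join key so that matching keys land in the same partition pair, and each pair is then joined with a small hash table.

```python
from collections import defaultdict


def radix_partition(rows, bits):
    """Split (key, payload) rows into 2**bits partitions by the low bits of the key."""
    mask = (1 << bits) - 1
    parts = [[] for _ in range(1 << bits)]
    for key, payload in rows:
        parts[key & mask].append((key, payload))
    return parts


def hash_join_partition(r_part, s_part):
    """Join one pair of partitions: build a hash table on r_part, probe with s_part."""
    table = defaultdict(list)
    for key, payload in r_part:
        table[key].append(payload)
    return [(key, rp, sp) for key, sp in s_part for rp in table.get(key, ())]


def radix_join(r, s, bits=2):
    """Partition both inputs the same way, then join matching partitions.

    Each partition pair is independent, so in a parallel setting the pairs
    could be handed to different threads; this sketch runs them in sequence.
    """
    out = []
    for r_part, s_part in zip(radix_partition(r, bits), radix_partition(s, bits)):
        out.extend(hash_join_partition(r_part, s_part))
    return out


if __name__ == "__main__":
    r = [(1, "a"), (2, "b"), (5, "c")]
    s = [(1, "x"), (5, "y"), (5, "z"), (7, "w")]
    print(sorted(radix_join(r, s)))
```

Because every partition pair can be joined without touching the others, the partition sizes determine how evenly work can be spread across cores, which is where load-balancing strategies such as the ones the abstract describes would apply.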

Keywords/Search Tags:

database query, multi-core processors, in-memory join, shared cache, packed data, MapReduce


Related items

1	Research On Optimization Of Database Query Execution For Shared Cache Chip Multi-Processor
2	Modeling Shared Cache Memory Accesses Of Multi-core Processors
3	Optimization Of Secondary Shared Memory In Heterogeneous Multi-core Systems For High-density Computing
4	Memory Optimization On Chip Multi-core Processors
5	The Optimization Of Hash Join Algorithm Based On KNL
6	Research On Optimization Of Main Memory Database Query Execution On Multi-core CPUs
7	Top-k Join Query Processing Method Based On MapReduce
8	Optimization Of Database Join Algorithms On DRAM/NVM-Based Hybrid Memory
9	Research And Implementation Of The Aggregate-Join Query Optimization Approach Based On Mapreduce
10	Research On Cache Coherence Protocols Based On Data Sharing Characteristics