Research And Implementation Of Parallel Query Processing In Column-store

Posted on:2015-03-12

Degree:Master

Type:Thesis

Country:China

Candidate:G H Zhang

Full Text:PDF

GTID:2268330425981987

Subject:Computer software and theory

Abstract/Summary:

With the spread of networks and the arrival of the information age, people face enormous volumes of data in daily life. Building data warehouse systems around these data, and then performing data mining and data analysis on them, has become a hot topic in data processing. Such workloads place high demands on query speed, and traditional row-store systems can no longer meet the requirements of modern massive data. Column-store systems, by contrast, can provide the underlying storage model for massive data processing.

In recent years, microprocessors have developed rapidly. Because of limits on processor power consumption and design, CPU development is shifting from high-frequency single-core processors to multi-core processors. Single-core processors have almost disappeared from the modern processor market, while chip multiprocessors (CMP, on-chip multi-core processors) have become mainstream. Multi-core processors provide the hardware environment for parallel query processing.

The main content of this thesis is the design and implementation of parallel query processing in a column-store system. Within DWMS, the column-store data warehouse system developed by our laboratory, we analyzed the existing query techniques and then designed and implemented a parallel query module. First, we analyze each stage of query processing to identify the parallel optimizations available at that stage. For example, in the hash-join stage, multiple joins can perform their hash operations simultaneously. Next, after analyzing the query execution mode based on pass blocks, we establish a pass block buffer for pipelined parallel processing. Data is then transmitted through the pass block buffer rather than directly as pass blocks, so that each node requests data only from the buffer. This decouples parent nodes from child nodes. Through effective management of the buffer, DWMS query performance can be improved. Finally, we analyze the parallel design of the entire query. To improve query efficiency further, we tune the relevant parameters, including the number of buffers and the number of parallel modules.

For a multi-core parallel environment, we give the DWMS data warehouse system a multithreaded design that mainly comprises the parallelization and pipelining of query nodes. Theoretical analysis and experiments indicate that the query parallelization design improves the query efficiency of DWMS.
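The pass block buffer idea described above can be illustrated with a small sketch. This is not the DWMS implementation, which the abstract does not show. It assumes that a pass block is a fixed-size chunk of column values and that the buffer behaves like a bounded queue sitting between a child node and its parent. The names `scan`, `select_gt`, `run_pipeline`, `BLOCK_SIZE`, and `BUFFER_SLOTS` are hypothetical and chosen only for illustration.

```python
import queue
import threading

BLOCK_SIZE = 4     # values per pass block (hypothetical setting)
BUFFER_SLOTS = 2   # blocks a buffer can hold (hypothetical setting)
END = object()     # sentinel that marks the end of a block stream


def scan(column, out_buf):
    """Child node: cut a column into pass blocks and push them to the buffer."""
    for i in range(0, len(column), BLOCK_SIZE):
        out_buf.put(column[i:i + BLOCK_SIZE])  # waits while the buffer is full
    out_buf.put(END)


def select_gt(in_buf, out_buf, threshold):
    """Parent node: pull blocks from its input buffer, filter, push results."""
    while True:
        block = in_buf.get()  # waits while the buffer is empty
        if block is END:
            out_buf.put(END)
            return
        out_buf.put([v for v in block if v > threshold])


def run_pipeline(column, threshold):
    """Run scan -> select on separate threads, linked only through buffers."""
    scan_to_select = queue.Queue(maxsize=BUFFER_SLOTS)
    select_to_root = queue.Queue(maxsize=BUFFER_SLOTS)
    workers = [
        threading.Thread(target=scan, args=(column, scan_to_select)),
        threading.Thread(
            target=select_gt,
            args=(scan_to_select, select_to_root, threshold),
        ),
    ]
    for w in workers:
        w.start()

    result = []
    while True:  # the root drains the last buffer
        block = select_to_root.get()
        if block is END:
            break
        result.extend(block)

    for w in workers:
        w.join()
    return result


if __name__ == "__main__":
    print(run_pipeline(list(range(10)), 5))
```

In this sketch neither node calls the other directly: each one only reads from or writes to a buffer, which is the separation between parent and child nodes that the abstract describes. The buffer capacity and block size stand in for the tunable parameters mentioned in the abstract, and real speedups would depend on the runtime and workload.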

Keywords/Search Tags:

Column-store, Multi-core processors, Pass block buffer, Parallelization, Multithreading

Related items

1	Research On Some Key Technologies For Column-stores
2	Research On Topology Reconfiguration Algorithms For Many-Core Processors With Core-Level Redundancy Mechanism
3	Expansion And Implementation, Based On The Parallelization Of General-purpose Image Library Of Multicore Processors
4	Column Store Database---A New Approach to GIS Application
5	Research On The Optimization Of The Schedule Engine Of The Oriented-column Database Based On The Multi-core Processors
6	Research And Design Of Embedded Operating System Based On Multi-Core Processors
7	Research On Inter-communication For Multi-core Processors
8	Parallelization Of Simulation Model Portability Specification On Multi-core Computer
9	Research On Database Optimization And Realization Based On Simulative Column-store
10	Dynamic thermal management of multi core processors using core hopping