The Optimization Of The Query Execution Engine In Column Oriented DWMS

Posted on: 2015-02-11  Degree: Master  Type: Thesis
Country: China  Candidate: D T Hao  Full Text: PDF
GTID: 2268330425981883  Subject: Computer application technology
Abstract/Summary:
As information technology continues to evolve, people must deal with explosive growth in data. Data warehouse systems (DWMS) emerged to better analyze large amounts of data. Because a data warehouse is used for data analysis, it is a read-optimized system, in contrast to an OLTP database system. A column-oriented data warehouse system is better suited to a read-optimized environment than a traditional data warehouse system. The author's lab is developing a column-oriented data warehouse system, and the author is developing its execution engine.
This paper is based on the development of the query execution engine in DWMS, a column-oriented data warehouse system. It investigates implementation and optimization problems in the query execution engine and studies three main issues.
The first is the architecture and implementation of the execution engine. This paper proposes a new hash join method that addresses hash collisions by building an index inside a bucket. The index is built only on buckets that contain too many collisions; when such a bucket is probed, the index improves probing efficiency.
The second is optimization with column-store techniques, such as utilizing a B+tree index to achieve efficient selection operations. This paper also tries to integrate a bitmap index into the query execution engine. Another important technique is direct execution on compressed data without decompression. Because implementing direct execution on compressed data leads to a code explosion problem, this paper explores only the selection and tuple reconstruction operations on compressed data.
The third is the optimization of query execution on multi-core processors. This paper focuses on the aggregation operation and tries to improve the partition method used in aggregation. It proposes a new dynamic partition method based on sampling, which adapts to data with different characteristics and makes efficient use of the CPU cache.
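The hash join idea described in the abstract (an index built only inside buckets that hold too many collisions) can be illustrated with a minimal sketch. The abstract does not say which index structure the thesis uses or how "too many" is decided, so everything concrete here is an assumption for illustration: the class name `BucketIndexedHashTable`, the `threshold` cutoff, and the choice of a sorted key array searched with binary search. The sketch also assumes that join keys within a bucket are mutually comparable.

```python
from bisect import bisect_left, bisect_right


class BucketIndexedHashTable:
    """Build side of a hash join; crowded buckets get a sorted-key index.

    Hypothetical illustration only: the index structure and the threshold
    are assumptions, not the design used in the thesis.
    """

    def __init__(self, num_buckets=8, threshold=4):
        self.num_buckets = num_buckets
        self.threshold = threshold  # assumed cutoff for "too many collisions"
        self.buckets = [[] for _ in range(num_buckets)]
        # sorted_keys[b] is None for unindexed buckets, else a sorted key list
        self.sorted_keys = [None] * num_buckets

    def build(self, rows, key):
        # Phase 1: ordinary hash partitioning of the build relation.
        for row in rows:
            b = hash(row[key]) % self.num_buckets
            self.buckets[b].append((row[key], row))
        # Phase 2: index only the buckets that exceed the threshold.
        for b, entries in enumerate(self.buckets):
            if len(entries) > self.threshold:
                entries.sort(key=lambda e: e[0])
                self.sorted_keys[b] = [k for k, _ in entries]

    def probe(self, k):
        b = hash(k) % self.num_buckets
        entries = self.buckets[b]
        keys = self.sorted_keys[b]
        if keys is not None:
            # Indexed bucket: binary search instead of a linear scan.
            lo, hi = bisect_left(keys, k), bisect_right(keys, k)
            return [row for _, row in entries[lo:hi]]
        # Small bucket: a linear scan is cheap enough.
        return [row for key_, row in entries if key_ == k]


def hash_join(build_rows, probe_rows, key, num_buckets=8, threshold=4):
    """Equi-join two lists of dict rows on `key`; returns (build, probe) pairs."""
    table = BucketIndexedHashTable(num_buckets, threshold)
    table.build(build_rows, key)
    return [(b, p) for p in probe_rows for b in table.probe(p[key])]
```

For example, `hash_join(orders, customers, "id")` would pair every row of `orders` with each row of `customers` that has the same `id`. Sparse buckets pay no indexing cost, while a probe into a crowded bucket costs a logarithmic search plus the size of the matching run.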
Keywords/Search Tags:column store, query execution, hash join, CPU Cache