Multi-Query Optimization Strategy Design And Implementation In Column-based OLAP System

Posted on: 2014-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: X C Lu
Full Text: PDF
GTID: 2248330395480755
Subject: Computer software and theory
Abstract/Summary:
Because they partition data vertically, column-stores have proved more suitable than row-stores for query-intensive analytic applications. An operation in an OLAP application can be mapped to a set of queries, which tend to be complex and long-running. Although implementing OLAP on a column-store is expected to give better execution performance, vertical partitioning breaks the hierarchy of the dimension table, so the multi-query optimization strategies designed for row-stores do not apply. In addition, the large number of column join operations expands the global plan space, making traditional search algorithms impractical. To address these issues, this paper proposes a multi-query optimization strategy for column-based OLAP systems, designed around the features of column-stores and OLAP applications, which retains the read-optimized property of column-stores while reusing operations and intermediate results across the query set. The main contributions of this paper are as follows. Based on the features of column-stores and OLAP applications, a series of transformation rules is proposed to generate a single global query plan for a set of related queries mapped from one OLAP request. To share and reuse operations, four newly defined nodes are introduced: the filter node, the group-by node, the merge node and the aggregation node. Based on the MuGA (Multiply Group by Algorithm) algorithm, the filter, merge and join nodes mark a group number on each tuple in the dimension and fact tables, so that column scan and column join operations can be shared.
For the aggregation node, a multi-phase aggregation algorithm is proposed that reuses aggregation work according to the compound group number of each fact-table tuple. In addition, on top of the DaMeng column-store database management system, a multidimensional model was designed and rules for generating the SQL of basic OLAP operations were proposed. The procedure for building the global plan is also described. Experiments show that column-stores combined with multi-query optimization can substantially reduce the response time of an OLAP system when handling massive data.
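The idea of tagging tuples with group numbers so that several queries share one scan and one join can be illustrated with a minimal Python sketch. This is not the thesis's MuGA implementation or its multi-phase aggregation algorithm. The table layout, the column names, the two example queries and the helper names `tag_dimension` and `shared_aggregate` are all assumptions made for illustration, and a per-query bitmask stands in for the group number.

```python
from collections import defaultdict

# Hypothetical column-oriented tables: each column is a plain Python list.
dim_region = {
    "key": [1, 2, 3],
    "region": ["north", "south", "north"],
    "tier": ["a", "a", "b"],
}
fact_sales = {
    "region_key": [1, 2, 3, 1, 2],
    "amount": [10, 20, 30, 40, 50],
}

# Hypothetical query set: (filter on a dimension row, group-by column).
queries = [
    (lambda row: row["region"] == "north", "region"),
    (lambda row: row["tier"] == "a", "tier"),
]


def tag_dimension(dim, queries):
    """Scan the dimension columns once and tag each key with a bitmask
    whose bit q is set when the row passes the filter of query q."""
    tags = {}
    for i in range(len(dim["key"])):
        row = {col: values[i] for col, values in dim.items()}
        mask = 0
        for q, (predicate, _) in enumerate(queries):
            if predicate(row):
                mask |= 1 << q
        tags[row["key"]] = (mask, row)
    return tags


def shared_aggregate(dim, fact, queries):
    """Scan the fact columns once; the join lookup carries the tag to each
    fact tuple, and every query whose bit is set adds to its own groups."""
    tags = tag_dimension(dim, queries)
    results = [defaultdict(int) for _ in queries]
    for key, amount in zip(fact["region_key"], fact["amount"]):
        mask, row = tags.get(key, (0, None))
        for q, (_, group_col) in enumerate(queries):
            if mask & (1 << q):
                results[q][row[group_col]] += amount
    return [dict(r) for r in results]


print(shared_aggregate(dim_region, fact_sales, queries))
```

In this sketch both queries are answered from a single pass over the dimension columns and a single pass over the fact columns, which is the kind of scan and join sharing the abstract describes. How the thesis derives compound group numbers and stages the aggregation is not specified here.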
Keywords/Search Tags: column store, OLAP, multi-query optimization, global query plan, operation reuse