The impact of early grouping and user-defined functions on query optimization

Posted on: 2000-01-01
Degree: D.Sc
Type: Thesis
University: University of Massachusetts Lowell
Candidate: Chiou, Shao-Fong
Full Text: PDF
GTID: 2468390014461808
Subject: Computer Science
Abstract/Summary:
On-line Analytical Processing (OLAP) is a new class of query processing for large-scale database systems. It gives users of Decision Support Systems (DSS) a quick, responsive way to navigate the large amounts of data held by big organizations. To achieve the required performance, frequently requested aggregate queries are precomputed (materialized) and stored in a centralized repository called a data warehouse. Because of the large amount of data, query optimizers must devise optimal plans for computing these materialized views in order to meet user requirements.

The traditional two-phase optimization approach for aggregate queries, i.e., optimizing the query without considering the GROUP BY and aggregation and then appending the aggregation to the resulting plan, is not guaranteed to produce optimal plans. A new technique that evaluates GROUP BY operators early in query optimization gives optimizers more opportunities to find optimal plans. However, pushing down the GROUP BY operator also increases the search space dramatically. The first part of the thesis derives heuristics that reduce the search space in cost-based optimization.

The second part of the thesis extends the optimizer's ability to generate plans for queries with holistic aggregate functions, using the early grouping technique. One difficulty is that the evaluation of holistic functions cannot start until all data are collected, which is incompatible with the early grouping technique: in the early grouping approach, data are evaluated by partition and the partition results are merged in the final stage. The thesis provides a method to start the evaluation by partially aggregating the input data even before all data have been collected.

The third part of the thesis enhances the database system to allow users to define their own grouping attributes. This contribution allows users to generate new information from existing data. The new ability, however, increases the burden on the optimizer to find an optimal plan when the attributes involved are derived from other attributes. The thesis provides a new evaluation method and a thorough cost analysis of the new model, offering optimizers new opportunities to generate more cost-efficient plans.
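
As a rough illustration of the early grouping idea described in the abstract, the sketch below compares a plan that joins first and aggregates afterwards with a plan that pre-aggregates on the join key before the join. It is not taken from the thesis: the tables `sales` and `products`, the function names, and the choice of SUM are all assumptions made for the example, and it relies on `product_id` being unique in `products`.

```python
from collections import defaultdict

# Hypothetical toy tables, invented for illustration only.
sales = [  # (product_id, amount)
    (1, 10.0), (1, 5.0), (2, 7.0), (2, 3.0), (3, 4.0),
]
products = [  # (product_id, category); product_id is assumed unique here
    (1, "toys"), (2, "toys"), (3, "books"),
]


def late_grouping(sales, products):
    """Two-phase style plan: join first, then GROUP BY category with SUM(amount)."""
    category_of = dict(products)
    totals = defaultdict(float)
    for product_id, amount in sales:  # every sales row takes part in the join
        if product_id in category_of:
            totals[category_of[product_id]] += amount
    return dict(totals)


def early_grouping(sales, products):
    """Early grouping plan: pre-aggregate sales on the join key, join, re-aggregate."""
    partial = defaultdict(float)
    for product_id, amount in sales:  # GROUP BY product_id pushed below the join
        partial[product_id] += amount
    totals = defaultdict(float)
    for product_id, category in products:  # join against the smaller partial result
        if product_id in partial:
            totals[category] += partial[product_id]
    return dict(totals)


late = late_grouping(sales, products)
early = early_grouping(sales, products)
print(late, early)
```

Both plans return the same totals, but the early grouping plan sends three pre-aggregated rows into the join instead of five raw rows. The rewrite is this simple only because SUM is distributive, so partial sums can be added together. A holistic function such as a median cannot be merged from per-partition results in the same way, which is the difficulty the second part of the abstract refers to.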
Keywords/Search Tags: Query, New, Early grouping, GROUP, Data, Plans, Functions, Optimization