Materialized view selection for multidimensional datasets

Posted on:2000-02-01

Degree:Ph.D

Type:Dissertation

University:The University of Wisconsin - Madison

Candidate:Shukla, Amit


GTID:1468390014464145

Subject:Computer Science

Abstract/Summary:

This dissertation describes techniques for speeding up Online Analytical Processing (OLAP) queries. OLAP systems allow users to quickly obtain answers to complex business queries. Answering these queries quickly, given that they aggregate large amounts of data, calls for specialized techniques. One technique used by OLAP systems to speed up multidimensional data analysis is to precompute aggregates on some subsets of dimensions and their corresponding hierarchies.

We first address the problem of efficiently estimating aggregate sizes. Precomputation of aggregate data improves query response time. However, deciding what and how much to precompute is difficult. The decision is further complicated by the fact that precomputation in the presence of hierarchies can result in an unintuitively large increase in the amount of storage required by the database. Hence, it is useful to estimate the storage blowup that will result from a proposed set of precomputations without actually computing them. We propose three strategies to solve this problem, and investigate the accuracy of these algorithms in estimating the blowup for different data distributions and database schemas.

Another intriguing problem we face is which aggregates to precompute. The more that is precomputed, the faster queries can be answered; however, it is often difficult to determine the best aggregates to precompute given a fixed amount of space. We study the structure of the precomputation problem and show that, under certain broad conditions on the multidimensional data, a simple and fast algorithm, PBS, achieves good performance bounds. We present an empirical study demonstrating that PBS picks a surprisingly good set of aggregates even when the conditions do not hold.

Queries in real-world applications frequently require aggregations over multiple cubes (in a star schema, this corresponds to multiple fact tables). Unfortunately, most research into aggregate selection has assumed that queries are over a single cube. We analyze aggregate selection in the context of multicube queries, and propose algorithms that perform significantly better than previously proposed algorithms for multicube workloads, without any deterioration in performance for single-cube query workloads.
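
The two single-cube problems in the abstract, how much storage a set of precomputed aggregates needs and which aggregates to keep within a fixed space budget, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the toy fact table, the names `aggregate_sizes` and `pick_smallest_first`, and the smallest-first greedy rule, which stands in for a simple size-based heuristic since the abstract does not spell out how PBS works. The sketch counts aggregate sizes exactly, which is the cost the dissertation's estimation strategies aim to avoid, and it ignores dimension hierarchies.

```python
from itertools import combinations

# Hypothetical toy fact table: (product, store, month, sales).
FACTS = [
    ("pen", "north", "jan", 10),
    ("pen", "north", "feb", 12),
    ("pen", "south", "jan", 7),
    ("ink", "north", "jan", 3),
    ("ink", "south", "feb", 5),
    ("ink", "south", "jan", 4),
]
DIMENSIONS = ("product", "store", "month")


def aggregate_sizes(facts, dimensions):
    """Map each subset of dimensions to the row count of its group-by."""
    sizes = {}
    for k in range(len(dimensions) + 1):
        for subset in combinations(range(len(dimensions)), k):
            # One output row per distinct combination of grouping values.
            groups = {tuple(row[i] for i in subset) for row in facts}
            sizes[tuple(dimensions[i] for i in subset)] = len(groups)
    return sizes


def pick_smallest_first(sizes, budget):
    """Greedy sketch: take aggregates in increasing size while they fit."""
    chosen, used = [], 0
    for dims, size in sorted(sizes.items(), key=lambda kv: (kv[1], kv[0])):
        if used + size > budget:
            break
        chosen.append(dims)
        used += size
    return chosen, used


if __name__ == "__main__":
    sizes = aggregate_sizes(FACTS, DIMENSIONS)
    total = sum(sizes.values())
    # Storage blowup: rows in all aggregates relative to the base data.
    print("blowup factor:", total / len(FACTS))
    print("selected:", pick_smallest_first(sizes, budget=10))
```

Even on six base rows, materializing every group-by takes 25 rows in total, which hints at why estimating the blowup before precomputing matters for realistic schemas.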

Keywords/Search Tags:

Data, Queries, OLAP, Selection, Multidimensional

Related items

1	Multidimensional Data Modeling Based On Subjects Integration An OLAP Analysis
2	Research On The Multidimensional Data Model And Aggregation Algorithm In LE-OLAP
3	Research On The Compression Method And CUBE Computation Of Multidimensional Data In Datawarehouse
4	Based On Data Warehousing, OLAP System Design And Implementation
5	Technology Management Information Multidimensional Data Model And OLAP Design
6	Design And Implementation Of The Modeling Tool Of Multidimensional Data
7	Multidimensional Data Modeling Based On Multi-subjects An OLAP Analysis
8	OLAP In The Research And Application Of Data Warehousing
9	The Design And Implementation Of Patent Multidimensional Data Analysis System For University Local Cooperation
10	Multidimensional Analysis Of Sales Management Applications