
An integrated approach to OLAP optimization

Posted on: 2004-11-15
Degree: Ph.D.
Type: Dissertation
University: University of Michigan
Candidate: Nadeau, Thomas P
Full Text: PDF
GTID: 1468390011975032
Subject: Computer Science
Abstract/Summary:
The goal of on-line analytical processing (OLAP) is to answer queries quickly from large amounts of data residing in a data warehouse. Recent literature on improving OLAP response time has focused on materialized views as the means of achieving this goal. Materialized views are precomputed answers saved to disk. The problem of improving OLAP response time can be divided into four sub-problems: view size estimation, selection of views for materialization, materialized view maintenance, and query optimization in the presence of materialized views. This dissertation contributes research to each of these four areas and describes how the processes interact and how they can be efficiently integrated. This research serves as a plan for the implementation of a commercial OLAP system with faster query response and much greater scalability than previous systems.

The contributions include the Pareto Model Algorithm for view size estimation, the Polynomial Greedy Algorithm for view selection, the Paged Bin-Tree for data maintenance, and the semantic search of view metadata for query optimization. The Pareto Model Algorithm for view size estimation reduces the coefficient of variation for the estimate/actual measurement when compared to prior art. This improvement in view size estimation results in better decisions by the view selection algorithm, since views are selected for materialization based on their relative benefits. The Polynomial Greedy Algorithm provides a means for selecting a near-optimal set of views for materialization in polynomial time relative to the number of dimensions. The polynomial time complexity represents scalability beyond the operational capacity of existing algorithms. Views selected for materialization can be quickly created and updated using the newly introduced Paged Bin-Tree index structure. The Paged Bin-Tree provides efficient update and query performance, significantly improving response time.
Finally, metadata is used intelligently to improve query optimization in the presence of materialized views. The semantic search for the best data source is straightforward and reduces the number of query rewrites compared to previous approaches.

These improvements to OLAP performance create opportunities for OLAP users to explore their data more quickly and fully. The scalable design opens the possibility of analyzing data sets of a wider variety and size than previous state-of-the-art systems could handle.
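
As an illustration of what "selecting views based on relative benefits" can look like, here is a minimal Python sketch of a textbook benefit-driven greedy heuristic over the full lattice of group-by views. It is not the Polynomial Greedy Algorithm from the dissertation, whose details the abstract does not give. The function names, the dimension names, and every row count below are assumptions made up for the example. The sketch enumerates all 2^d views for d dimensions, which is the exponential growth that motivates a selection method polynomial in the number of dimensions.

```python
from itertools import combinations

def all_views(dimensions):
    """Enumerate every group-by subset of the dimensions (2**d views)."""
    dims = sorted(dimensions)
    return [frozenset(c) for r in range(len(dims) + 1)
            for c in combinations(dims, r)]

def greedy_select(sizes, k):
    """Pick up to k views to materialize, in addition to the base view.

    sizes maps each view (a frozenset of dimension names) to its estimated
    row count. The base view (all dimensions) is assumed to be materialized
    already, so every view can initially be answered at the base view's cost.
    """
    base = max(sizes, key=len)
    cost = {v: sizes[base] for v in sizes}  # cheapest current source per view
    chosen = []
    for _ in range(k):
        best, best_benefit = None, 0
        for cand in sizes:
            if cand == base or cand in chosen:
                continue
            # A candidate can answer any view whose dimensions it contains.
            benefit = sum(max(0, cost[q] - sizes[cand])
                          for q in sizes if q <= cand)
            if benefit > best_benefit:
                best, best_benefit = cand, benefit
        if best is None:  # no remaining view reduces any cost
            break
        chosen.append(best)
        for q in sizes:
            if q <= best:
                cost[q] = min(cost[q], sizes[best])
    return chosen

# Hypothetical estimated row counts for a three-dimension cube.
sizes = {
    frozenset({"product", "store", "time"}): 1000,
    frozenset({"product", "store"}): 800,
    frozenset({"product", "time"}): 900,
    frozenset({"store", "time"}): 300,
    frozenset({"product"}): 100,
    frozenset({"store"}): 20,
    frozenset({"time"}): 50,
    frozenset(): 1,
}
picked = greedy_select(sizes, 2)
print([sorted(v) for v in picked])
```

With these made-up sizes the first pick is the (store, time) view, because it cheaply answers four of the eight views. The choice depends entirely on the size estimates, which is why a more accurate view size estimator leads to better selections.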
Keywords/Search Tags: OLAP, Data, View size estimation, Algorithm for view, Materialized views, Optimization