On tuning and optimization for multiple queries in databases

Posted on: 2003-10-18
Degree: Ph.D
Type: Dissertation
University: University of California, Santa Barbara
Candidate: O'Gorman, Kevin
Full Text: PDF
GTID: 1468390011481366
Subject: Computer Science
Abstract/Summary:
Multiple concurrent queries occur in many database settings. This dissertation explores two such settings: one in which the system being examined generates the queries, and one in which the system processes multiple incoming queries. In the former case, we originally set out to explore the performance of a particular algorithm for incremental maintenance of a materialized view in a data warehouse. In the process, we discovered that at a realistic database size, unexpected issues with query and update processing had to be understood and dealt with before the results would be meaningful. The development of this understanding is as much to the point as the particular experimental results, because it illuminates the significance of the experimental environment for research. The experiment proper validates that incremental maintenance is feasible over a wide range of update sizes (granularities), and that in all cases a cursor-based version of the algorithm performs best.
For the second part, we explore using middleware as an optimization tool for multiple concurrent queries. Observing that common subexpressions derive from common data, and that the amount of data is usually greatest at the source, we propose an optimization technique that exploits sharable access patterns to the underlying data, especially scans of large portions of tables or indexes, in environments where query queuing or batching is an acceptable approach. We show that simultaneous queries with such sharable accesses tend to form synchronous groups (teams) whose members benefit each other through the operation of the disk cache, in effect using it as an implicit pipeline. We propose scheduling the queries to reinforce this tendency, and show that in some systems this can be accomplished even from outside the database engine, with application server middleware.
We present an algorithm for scheduling from a queue of similar queries, designed to promote such teamwork. This is implemented as middleware for use with a commercial database engine. Finally, we present tests using the query mix from the TPC-R benchmark, achieving a speedup of 2.34 over the default scheduling provided by one database, but also show that the success depends on the details of the computing environment.
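The abstract does not give the scheduling algorithm itself, so the following is only a minimal illustrative sketch of the general idea of forming teams from a queue, not the dissertation's method. It assumes each queued query is annotated with the set of tables it scans in bulk; the names `Query`, `form_teams`, and `max_team_size` are hypothetical.

```python
from collections import deque
from typing import NamedTuple


class Query(NamedTuple):
    name: str
    scans: frozenset  # assumed annotation: tables (or indexes) scanned in bulk


def form_teams(queue, max_team_size=4):
    """Greedily group queued queries that share a large scan with a leader.

    Illustrative only: the oldest pending query becomes the team leader,
    and later queries join if they scan at least one table the leader scans.
    """
    pending = deque(queue)
    teams = []
    while pending:
        leader = pending.popleft()  # oldest query always runs next
        team = [leader]
        remaining = deque()
        for q in pending:
            if len(team) < max_team_size and leader.scans & q.scans:
                team.append(q)      # shares a scan, so start it alongside
            else:
                remaining.append(q)  # keeps its place in the queue
        pending = remaining
        teams.append(team)
    return teams


queue = [
    Query("Q1", frozenset({"lineitem"})),
    Query("Q2", frozenset({"orders"})),
    Query("Q3", frozenset({"lineitem", "orders"})),
    Query("Q4", frozenset({"customer"})),
]
teams = [[q.name for q in team] for team in form_teams(queue)]
```

In this sketch a middleware layer would submit each team's queries to the database at the same time, so that their scans of the shared table can proceed together through the disk cache. Whether that actually helps depends, as the abstract notes, on the details of the computing environment.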
Keywords/Search Tags: Database, Queries, Multiple, Optimization