Optimization of data accesses for database applications

Posted on: 2006-09-05
Degree: Ph.D.
Type: Dissertation
University: University of Illinois at Urbana-Champaign
Candidate: Chen, Zhifeng
Full Text: PDF
GTID: 1458390008474317
Subject: Computer Science
Abstract/Summary:
Access speeds of main memory and disks lag far behind microprocessor speeds. Consequently, disk and memory accesses are significant performance bottlenecks for many applications. This dissertation investigates techniques that improve the effectiveness of buffer caches and processor caches to bridge these two speed gaps for database servers in a data center environment.

To address the disk bottleneck, this dissertation proposes global management of the database-storage buffer cache hierarchy, so that the hierarchy delivers performance comparable to that of a single cache of the aggregate size. Managing buffer caches globally raises two questions: (1) without modifying the I/O interface (the hierarchy-aware approach), how can the database and storage caches collaborate globally; and (2) with an extended I/O interface (the aggressively-collaborative approach), is the resulting performance improvement worthwhile?

To answer the first question, this dissertation proposes hierarchy-aware caching. This method transparently tracks evictions from the database server's buffer cache. Upon an eviction, the storage server selectively fetches the corresponding block from disk. The evaluation shows that this method significantly improves the storage cache hit ratio and the database transaction rate.

To answer the second question, this dissertation empirically explores the design space for database-storage collaborative caching. This design space has three dimensions: collaboration approaches, replacement algorithms, and workload-specific optimizations. Through both simulation and implementation, this dissertation evaluates 248 design combinations, which include all previously proposed solutions and many new ones. The results indicate that aggressively-collaborative caching provides only marginal performance improvement over hierarchy-aware caching in all tested cases.
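
The eviction-tracking idea behind hierarchy-aware caching can be illustrated with a small sketch. It is not the dissertation's implementation. It assumes, hypothetically, that the storage server can see which database buffer slot each read request will fill, so that reuse of a slot implies the block previously held there was evicted. The class name `StorageCache`, the `db_slot` parameter, the LRU policy, and the choice to drop a block from the storage cache once the database has read it are all assumptions made for illustration.

```python
from collections import OrderedDict


class StorageCache:
    """LRU storage cache that admits blocks when the database cache evicts them."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # cached block ids, least recently admitted first
        self.slot_to_block = {}      # database buffer slot -> block it last received
        self.hits = 0
        self.misses = 0

    def _admit(self, block_id):
        # Stand-in for fetching the block from disk into the storage cache.
        self.blocks[block_id] = True
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # drop the least recently admitted block

    def read(self, block_id, db_slot):
        """Serve a database read destined for buffer slot db_slot.

        Returns the block inferred to have been evicted from the database
        cache, or None if the slot was previously unused.
        """
        if block_id in self.blocks:
            self.hits += 1
            # Assumption: the database now holds the block, so drop our copy.
            del self.blocks[block_id]
        else:
            self.misses += 1

        evicted = self.slot_to_block.get(db_slot)
        self.slot_to_block[db_slot] = block_id
        if evicted is not None and evicted != block_id:
            # Slot reuse implies the old block left the database cache.
            self._admit(evicted)
        return evicted
```

In this sketch the I/O interface carries no explicit eviction message: the storage side infers evictions purely from the request stream, which is the sense in which the tracking is transparent.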
In short, hierarchy-aware caching, without changing the I/O interface, can perform as well as aggressively-collaborative caching.

To address the memory access bottleneck, this dissertation proposes Hanuman, which dynamically reformats data in the database buffer. By adapting data layouts to the changing workload, Hanuman improves spatial data locality and, in turn, the processor cache hit ratio. To determine the best data layout, Hanuman performs a heuristic cost analysis of candidate layouts and chooses the layout that minimizes the estimated number of cache misses. The results indicate that Hanuman is effective and efficient.
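
Choosing a layout by estimated cache misses can be sketched as follows. This is a toy cost model, not Hanuman's actual heuristic: the abstract does not specify the candidate layouts or the cost formula. The sketch assumes a layout is a grouping of attributes stored contiguously, a 64-byte cache line, and a workload described as (attributes read, records scanned) pairs. The function names, the attribute widths, and the two candidate layouts are all hypothetical.

```python
import math

CACHE_LINE_BYTES = 64  # assumed cache line size


def estimated_misses(layout, attr_bytes, workload):
    """Estimate the cache lines touched by a workload under a layout.

    layout: list of attribute groups; attributes in a group are stored together.
    attr_bytes: attribute name -> width in bytes.
    workload: list of (attributes_read, records_scanned) pairs.
    """
    total = 0
    for attrs, records in workload:
        for group in layout:
            # A group is touched if the query reads any attribute in it.
            if attrs & set(group):
                width = sum(attr_bytes[a] for a in group)
                total += math.ceil(records * width / CACHE_LINE_BYTES)
    return total


def choose_layout(candidates, attr_bytes, workload):
    """Return the name of the candidate layout with the lowest estimate."""
    return min(
        candidates,
        key=lambda name: estimated_misses(candidates[name], attr_bytes, workload),
    )


# Hypothetical table with three attributes and two candidate layouts.
attr_bytes = {"id": 8, "name": 48, "balance": 8}
candidates = {
    "row": [["id", "name", "balance"]],
    "column": [["id"], ["name"], ["balance"]],
}
scan = [({"balance"}, 1000)]               # scan one attribute over many records
point = [({"id", "name", "balance"}, 1)]   # read one whole record

print(choose_layout(candidates, attr_bytes, scan))
print(choose_layout(candidates, attr_bytes, point))
```

Under this model the single-attribute scan favors the column grouping and the whole-record lookup favors the row grouping, which illustrates why a layout that adapts to the changing workload can reduce cache misses.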
Keywords/Search Tags: Database, I/O interface, Cache, Hanuman