Architecture-sensitive database query processing

Posted on: 2005-03-14
Degree: Ph.D.
Type: Thesis
University: Columbia University
Candidate: Zhou, Jingren
Full Text: PDF
GTID: 2458390008999553
Subject: Computer Science
Abstract/Summary:
During the last decade, microprocessors have improved tremendously, but this growth has not been evenly distributed over all aspects of hardware performance. Advances in the speed of commodity CPUs have far outpaced advances in memory latency, so main memory access is becoming a significant cost component of database operations. Database systems face new performance bottlenecks, such as memory access and poor utilization of sophisticated execution hardware. Research has shown that the hardware behavior of DBMSs is suboptimal compared with that of scientific workloads. This illustrates the importance of rethinking and developing database query processing algorithms in the context of new computer architectures.
This thesis studies the interactions between DBMSs and modern processor architecture. We design techniques to exploit architectural innovations and to alleviate performance bottlenecks throughout the CPU and the memory hierarchy, including caches, memory, and disks.
The first part of this thesis presents a novel technique to implement database operations with a high degree of intra-instruction parallelism. We describe the SIMD instructions now commonly available in commodity processors and show how database operations can be accelerated using them.
Database research has demonstrated that the dominant memory stalls are due to data cache misses on the second-level cache and instruction cache misses on the first-level instruction cache. We address both issues in the second part of this thesis. We propose buffering techniques to improve the data cache performance of index structures and the instruction cache performance of query processing in database systems. The main benefit derives from better data and instruction reference locality. Our techniques can be easily integrated into current database systems without significant changes.
The final part of this thesis describes a new storage model called MBSM (Multi-resolution Block Storage Model) for laying out tables on disk. Compared to other data storage models, MBSM has both good I/O performance and good cache utilization in main memory.
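The abstract does not spell out how buffering improves instruction reference locality. One plausible reading, offered here only as an illustrative sketch and not as the thesis's implementation, is a buffer operator placed between two pipelined query operators so that each operator's code runs over a batch of tuples before control passes to the next. All names below (`buffer`, `traced_select`, `traced_sum`, `run`, `switches`) are hypothetical, and Python itself will not show any instruction cache effect; the trace only makes the change in control flow visible.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


def traced_select(child: Iterable[int], predicate: Callable[[int], bool],
                  trace: List[str]) -> Iterator[int]:
    """Selection operator that records each time its code runs."""
    for row in child:
        trace.append("select")
        if predicate(row):
            yield row


def buffer(child: Iterable[int], capacity: int) -> Iterator[int]:
    """Hypothetical buffer operator: collect up to `capacity` rows from the
    child before handing any of them to the parent operator."""
    batch: List[int] = []
    for row in child:
        batch.append(row)
        if len(batch) == capacity:
            yield from batch
            batch = []
    yield from batch  # flush the final, possibly partial, batch


def traced_sum(child: Iterable[int], trace: List[str]) -> int:
    """Aggregation operator that records each time its code runs."""
    total = 0
    for row in child:
        trace.append("sum")
        total += row
    return total


def run(rows: Iterable[int], capacity: Optional[int] = None) -> Tuple[int, List[str]]:
    """Run select -> (optional buffer) -> sum and return the result and trace."""
    trace: List[str] = []
    plan: Iterable[int] = traced_select(rows, lambda r: r % 2 == 0, trace)
    if capacity is not None:
        plan = buffer(plan, capacity)
    return traced_sum(plan, trace), trace


def switches(trace: List[str]) -> int:
    """Count how often execution moves from one operator's code to another's."""
    return sum(1 for a, b in zip(trace, trace[1:]) if a != b)
```

Calling `run(range(8))` and `run(range(8), capacity=4)` returns the same sum, but the buffered plan switches between operators far less often, which is the locality effect the abstract attributes to buffering.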
Keywords/Search Tags:Database, Cache, Memory, Performance, Query