Font Size: a A A

Research On Optimization Of Database Query Execution For Shared Cache Chip Multi-Processor

Posted on:2010-09-07Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y D DengFull Text:PDF
GTID:1118360305473654Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
With the rapid development of hardware technology, the memory price is more and more cheap, and large memory capacity has become the standard database server configuration. This not only reduces the disk I/O cost in database query execution, but also promotes the universal application of main memory database. The performance of database is improved, meanwhile new problems are appearing. As the speed growth of processor is far greater than that of the memory, processor has to spend a lot of time waiting for data fetched from the memory to the cache, and memory access has become one of the main bottlenecks to database performance improvement. Similarly, the advent of CMP brings to the database performance improvement, but also the challenges. Firstly, many databases still adopt query algorithms based on single-threaded pattern, and database can't make full use of the parallel computing resources of CMP; Secondly, cores in CMP always share some of the resources, such as cache and memory bandwidth. In the condition that main memory access has become the main cost of database, the access confliction of shared cache when multithread access shared cache simultaneously will bring negative impacts to database performance improvement; Thirdly, the limited memory bandwidth and load unbalance between cores could affect the efficiency of multithread executing. Therefore, if we want to fully exploit the capability of shared cache CMP to improve database performance, in one aspect we have to optimize the performance of query execution through multithreaded parallel execution; in another, we have to improve the cache performance of multithreaded execution, especially we have to reduce the access confliction of shared cache. At present, the optimization of database query execution based on CMP is still in the preliminary stage of research, and there are many technical problems need to be resolved.Query execution optimization has always got widespread concern since the prevalence of database, and it is a challenging research direction. The purpose of this dissertation is to optimize the performance of query execution based on the shared cache CMP. On the basis of summarizing and analyzing corresponding research results, aiming at the performance bottlenecks of query executing in shared cache CMP, and to meet the needs of performance improvement of query execution, this dissertation studies some kind of query execution types with large cost, such as sorting, join, and index access. The main works and innovations are as follows:(1) We propose a multithreaded database sorting execution framework optimized by shared cache CMP. This framework adopts Inplaced Flash QuickSort (IFQS) algorithm. Aiming at different characteristics of data access in the three phases of IFQS, we respectively present multithreaded execution patterns and the corresponding cache optimization strategies in each phase, focuses are especially put on reducing access confliction of the shared cache.(2) We present a multithreaded hash join execution framework based on data partition strategy and adopting Radix-Join algorithm, which is composed of cluster partition phase and cluster join phase. Through deep analysis of performance bottlenecks when running multithreaded Radix-Join in shared cache CMP, we present an adaptive multithreaded cluster partition strategy to optimize the performance of cluster partition. Focusing on the cluster join phase, we present a multithreaded cluster join strategy based on cluster size classification, and optimize memory access of cluster join. The above optimization could reduce access confliction of shared cache and load unbalance when multiple threads are running in the framework.(3) For index nested loop join, based on shared cache CMP, we present a shared cache sensitive executing framework of index nested loop join (SCS-INLJPEF). SCS- INLJPEF adopts multithread execution pattern based on pipeline, and set operations in the pipeline according to data operating node in the query execution plan. A shared cache sensitive buffer structure is presented to manage buffers in SCS-INLJPEF. Through memory access cost model of SCS-INLJPEF, we allocate computing resources of CMP reasonably to the operators in SCS-INLJPEF. This allocation could not only reduce load unbalance, but also reduce access confliction of shared cache.(4) For non-index nested loop join (NINLJ), based on shared cache CMP, we present a multithread execution framework of NINLJ. This framework is composed of cluster partition phase and cluster join phase. Focusing on the cluster partition phase, we set reasonable starting time of cluster partition thread to reduce access confliction of shared cache; as the cluster join phase has poor cache behaviors, we utilize the advantages of very small average size of cluster and cluster join thread access clusters sequentially to present multithreaded cluster join execution strategy. This strategy utilizes preload thread to improve cache performance of cluster join thread, and by setting rational parameters of preload thread we make this strategy fit different core number of CMP, also reduce access confliction of shared cache.(5) We present a CSB+-Trees access module of multithreaded execution pattern based on pipelined (CSBT-MAM) execution. CSBT-MAM sets operators in the pipeline according to the hierarchy of the CSB+-Trees. Through the process analysis of CSBT-MAM, we present the memory access model, and then based on the cost model, we could reasonably partition nodes in the CSB+-Trees, and set operator number and corresponding work sets based on the cost model to improve cache performance of index access thread. Also, we allocate computing resources of CMP according to the node partition. Finally, CSBT-MAM is used to optimize the index nested loop join.
Keywords/Search Tags:Optimization of database query execution, Shared cache chip multi-processor, Cache access performance, Multithreaded execution, Database sorting, Hash join, Nested loop join
PDF Full Text Request
Related items