Research On Parallel Query For In-memory Database

Posted on: 2016-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2308330470457724
Subject: Computer system architecture
Abstract/Summary:
With the advent of large-capacity memory and falling memory prices, in-memory databases have come into wide use. Database performance has improved, but new challenges have appeared. Memory speed has grown much more slowly than processor speed, so memory access has become one of the bottlenecks of database query processing, and the advent of multi-core processors makes the problem worse. Meanwhile, as the volume of in-memory data grows, the likelihood of errors and of imbalanced data access increases, so query algorithms need to be more flexible and fault tolerant. When multiple threads access a shared cache simultaneously, the resulting access conflicts degrade database performance. In addition, limited memory bandwidth and load imbalance between cores can reduce the efficiency of multithreaded execution. Therefore, many technical problems must still be solved in order to fully exploit the capability of shared-cache processors and to reduce access conflicts so as to improve database performance.
Query execution optimization is a challenging research topic that has long attracted wide attention. This thesis studies join algorithms to improve query execution performance on shared-cache systems. The main contributions of this thesis can be summarized as follows:
A strategy based on data partitioning is proposed for multithreaded parallel join algorithms. For memory-constrained servers, we propose two join algorithms, one based on radix join and one based on sort-merge join, and optimize them for multi-core systems with a shared cache. For the partitioning phase, this thesis proposes an adaptive multithreaded partitioning strategy. For the aggregation phase, we propose an execution strategy that is based on size classification and optimizes memory access.
The above optimization techniques significantly reduce load imbalance and conflicts between processor cores, and improve the execution efficiency of the threads.
We design and implement a flexible join algorithm based on the MapReduce framework. Compared with traditional algorithms, it is more flexible and has stronger fault tolerance. It consists of a Map stage and a Reduce stage, and is adapted to radix join. We also dissect the internal phases of a typical in-memory join algorithm and propose an effective hash join algorithm for multi-core architectures using the MapReduce framework. To reduce the cost of tagging and sorting, the new algorithm packs the data for tagging. Our experimental results show that our algorithm clearly outperforms existing join algorithms for shared-memory architectures.
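As a rough illustration of how a join can be split into a Map stage that partitions on radix bits and a Reduce stage that joins each partition, here is a minimal single-threaded sketch. It is not the thesis's implementation: the function names, the `RADIX_BITS` parameter, and the per-record `"R"`/`"S"` tags are assumptions made for this example, and the packed tagging and multithreaded scheduling described in the abstract are not reproduced.

```python
from collections import defaultdict
from itertools import chain

RADIX_BITS = 2  # hypothetical: number of low-order key bits used to pick a partition


def map_stage(tuples, tag, radix_bits=RADIX_BITS):
    """Emit (partition_id, (tag, key, payload)) for each (key, payload) tuple."""
    mask = (1 << radix_bits) - 1
    for key, payload in tuples:
        yield key & mask, (tag, key, payload)


def shuffle(pairs):
    """Group the mapped records by partition id."""
    partitions = defaultdict(list)
    for partition_id, record in pairs:
        partitions[partition_id].append(record)
    return partitions


def reduce_stage(records):
    """Hash join inside one partition: build on relation R, probe with relation S."""
    table = defaultdict(list)
    for tag, key, payload in records:
        if tag == "R":
            table[key].append(payload)
    joined = []
    for tag, key, payload in records:
        if tag == "S":
            for r_payload in table.get(key, ()):
                joined.append((key, r_payload, payload))
    return joined


def mapreduce_join(r, s):
    """Equi-join two lists of (key, payload) tuples on the integer key."""
    partitions = shuffle(chain(map_stage(r, "R"), map_stage(s, "S")))
    result = []
    for partition_id in sorted(partitions):
        result.extend(reduce_stage(partitions[partition_id]))
    return result


if __name__ == "__main__":
    r = [(1, "a"), (2, "b"), (5, "c")]
    s = [(1, "x"), (5, "y"), (5, "z"), (7, "w")]
    print(mapreduce_join(r, s))
```

Because tuples with equal keys always share the same low-order bits, each partition can be joined independently. In a parallel setting this independence is what would let separate threads work on separate partitions, each with a hash table small enough to stay cache resident.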
Keywords/Search Tags: database query, multi-core processors, in-memory join, shared cache, packed data, MapReduce