
Design And Optimizations Of Multi-table Join Operations For NVM-based In-memory Databases

Posted on: 2019-02-11; Degree: Master; Type: Thesis
Country: China; Candidate: X C Li; Full Text: PDF
GTID: 2428330566477995; Subject: Computer Science and Technology
Abstract/Summary:
Emerging non-volatile memory (NVM), such as phase change memory (PCM), has attracted much attention in both industry and academia because of its low power consumption, high storage density, fast read/write speeds, and byte addressability. It is a promising candidate to replace DRAM and to serve as the basis of a new system architecture. NVM (PCM, for example) also has drawbacks, such as limited write endurance and asymmetric read and write speeds. The emergence of NVM makes it possible to store a database directly on NVM and to build an NVM-based in-memory database system architecture.

Multi-table join is a common and very important database operation. It generates a large number of intermediate tables, which causes a large number of write operations to the storage device. Traditional multi-table join optimization algorithms do not take the characteristics of NVM into account, including its limited write endurance and byte addressability, so they are not well suited to this new architecture. For this reason, this thesis proposes an "NVM-friendly" multi-table join algorithm for NVM-based in-memory databases. The goal is to make full use of the advantages of NVM while reducing writes to NVM as much as possible in order to extend its lifetime.

First, we propose the NVjoin algorithm, which optimizes the join order by analyzing the correlation between tables and estimating the size of intermediate results, so as to minimize write operations on NVM. Second, we propose LWTab, a lightweight data structure for organizing intermediate results during a multi-table join, which fully exploits the byte-addressable nature of NVM and thereby further reduces the NVM writes caused by intermediate results. Combining the two techniques yields the NVjoin+LWTab algorithm.

To determine the sampling probability used for estimation in the algorithm, we conduct a large number of experiments on three types of data: Zipf-distributed, normally distributed, and uniformly distributed. Based on these experiments, the sampling probability is set to 0.1. Finally, we conduct comparative experiments against other multi-table join methods to measure whether NVjoin+LWTab achieves significant improvements in both reducing NVM write operations and reducing elapsed time. From the experimental results we draw two conclusions: 1) a proper sampling probability helps to obtain a better join order without affecting the overall elapsed time of join processing; 2) Cartesian products have a significant impact on both the total size of intermediate results and the overall elapsed time of the join process. Because NVjoin+LWTab fully considers the correlation between tables, minimizes unnecessary Cartesian product operations, and avoids redundant copies of data, it reduces NVM writes as much as possible while maintaining performance. The experimental results show that, compared with the join order provided by MySQL, NVjoin reduces NVM writes by a factor of 104.21. In addition, LWTab reduces NVM writes by a further factor of 16.74 on top of NVjoin. In terms of elapsed time, NVjoin+LWTab is on average 87.24% faster than the MySQL multi-join method.
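
The abstract does not give the details of NVjoin, so the following is only a minimal Python sketch of the general idea it describes: estimate join sizes from a sample, then pick a join order that keeps intermediate results small and never requires a Cartesian product. The function names (`estimate_join_size`, `greedy_join_order`), the Bernoulli sampling scheme, the greedy strategy, and the toy tables are all assumptions made for illustration and are not taken from the thesis. Only the default sampling probability of 0.1 comes from the abstract.

```python
import random


def estimate_join_size(left, right, key_left, key_right, p=0.1, seed=0):
    """Estimate the size of an equi-join by sampling rows of `left`.

    Each row of `left` is kept with probability p (Bernoulli sampling);
    the number of matches found for the sample is scaled up by 1/p.
    """
    rng = random.Random(seed)
    # Count how often each join-key value occurs in `right`.
    counts = {}
    for row in right:
        counts[row[key_right]] = counts.get(row[key_right], 0) + 1
    sample = [row for row in left if rng.random() < p]
    matches = sum(counts.get(row[key_left], 0) for row in sample)
    return matches / p


def greedy_join_order(tables, edges, p=0.1, seed=0):
    """Return a join order chosen greedily from pairwise size estimates.

    tables: {name: list of row dicts}
    edges:  {(a, b): (col_a, col_b)} join predicates between tables
    Only tables connected by a join predicate to the already joined set
    are considered, so no Cartesian product is ever introduced.
    (Simplification: pairwise estimates stand in for the true size of
    each intermediate result.)
    """
    est = {
        (a, b): estimate_join_size(tables[a], tables[b], ca, cb, p, seed)
        for (a, b), (ca, cb) in edges.items()
    }
    first = min(est, key=est.get)  # cheapest pair starts the plan
    order = list(first)
    joined = set(first)
    while len(joined) < len(tables):
        # Edges with exactly one endpoint already joined.
        candidates = {
            pair: size
            for pair, size in est.items()
            if (pair[0] in joined) != (pair[1] in joined)
        }
        if not candidates:
            raise ValueError("join graph is disconnected; "
                             "a Cartesian product would be required")
        a, b = min(candidates, key=candidates.get)
        nxt = b if a in joined else a
        order.append(nxt)
        joined.add(nxt)
    return order


# Hypothetical toy schema, for illustration only.
tables = {
    "orders": [{"id": i, "customer_id": i % 10} for i in range(100)],
    "customers": [{"id": i, "region_id": i % 2} for i in range(10)],
    "regions": [{"id": i} for i in range(2)],
}
edges = {
    ("orders", "customers"): ("customer_id", "id"),
    ("customers", "regions"): ("region_id", "id"),
}
print(greedy_join_order(tables, edges, p=0.1))
```

In this sketch a larger `p` gives more accurate estimates at the cost of scanning more rows, which mirrors the trade-off the abstract describes when choosing the sampling probability. With `p=1.0` the estimate equals the exact join size.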
Keywords/Search Tags: Non-Volatile Memory, Multi-table join, Join order, Database