RDMA-based Distributed Memory Database Query Engine

Posted on: 2019-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: C Chen
Full Text: PDF
GTID: 2348330563953917
Subject: Computer system architecture
Abstract/Summary:
The growing demand for data storage and massive data processing in the Internet era has driven the development of distributed database systems, which have become a focus of the industry. A distributed database scales well and can make effective use of the computing and storage capacity of multiple cores and multiple machines. At the same time, because memory prices have kept falling and operating systems have come to support larger address spaces in recent years, distributed memory databases, which use memory as their storage engine, have become practical. However, because of the inherent complexity of the traditional TCP/IP network protocol stack, the gap between memory read speed and network transmission speed keeps widening. Network I/O has replaced disk I/O as the new bottleneck in distributed memory database systems: it limits the scalability of distributed systems and restricts the performance of distributed memory databases. As remote direct memory access (RDMA) technologies, which offer higher throughput and lower latency than TCP/IP networks, mature, using RDMA to improve the network layer of distributed memory database systems is attracting increasing attention.

This thesis builds on Goldfish, a distributed memory database system developed independently by our lab. We replace the traditional TCP/IP network in its query engine with RDMA and design and implement an RDMA-based distributed memory database query engine, in order to speed up data transmission between tasks in the distributed query engine and reduce query time. The main work of the thesis is as follows:

1) We study the various data sending and receiving modes available in RDMA networks and analyze their main advantages and disadvantages. Using the RDMA Verbs API, we design and implement two RDMA-based data transmission frameworks for high-throughput or low-latency transfer.

2) RDMA requires memory to be registered with the RDMA network in advance as a send or receive buffer before data can be sent or received, and one advantage of RDMA is that it avoids copying data between the application and the kernel. A set of buffer memory pool management strategies is therefore designed and implemented so that buffers can be allocated quickly.

3) A distributed memory database executor is designed and implemented on top of the high-throughput and low-latency RDMA sending frameworks and the buffer memory pool. It accepts the execution plan delivered by the query optimizer and ensures that tasks are executed quickly and correctly.

Finally, an echo server is implemented on the RDMA sending framework and compared with an echo server that uses TCP over InfiniBand on a Mellanox NIC as the underlying network. The test results show that the RDMA-based echo server has significantly higher throughput and lower latency than the TCP-based echo server. In addition, Goldfish-RDMA, the RDMA-based implementation, is compared with Goldfish-TCP, the TCP-based implementation, and with SparkSQL, an open-source big data analysis engine. Goldfish-RDMA achieves better query performance than both Goldfish-TCP and SparkSQL on the TPC-H data set.
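The buffer memory pool idea in item 2) can be illustrated with a minimal sketch. This is not the Goldfish implementation: the class name `BufferPool`, its methods, and the fixed-size policy are all assumptions made for illustration, and a plain `bytearray` stands in for memory that a real system would register with the NIC once at startup (for example with `ibv_reg_mr` from the Verbs API). The point it shows is that the expensive step happens once, after which acquiring and releasing a buffer is a constant-time free-list operation.

```python
from collections import deque
from typing import Optional, Tuple


class BufferPool:
    """Hypothetical fixed-size buffer pool; all memory is set up once, up front."""

    def __init__(self, buffer_size: int, count: int) -> None:
        # Assumption: in a real RDMA system this region would be registered
        # with the NIC exactly once here. A bytearray is only a stand-in.
        self._region = bytearray(buffer_size * count)
        view = memoryview(self._region)
        # Slice the region into equally sized, zero-copy views.
        self._buffers = [
            view[i * buffer_size:(i + 1) * buffer_size] for i in range(count)
        ]
        self._free = deque(range(count))  # indices of free buffers
        self._in_use = set()              # indices currently handed out

    def acquire(self) -> Optional[Tuple[int, memoryview]]:
        """Return (index, buffer) in O(1), or None if the pool is exhausted."""
        if not self._free:
            return None
        idx = self._free.popleft()
        self._in_use.add(idx)
        return idx, self._buffers[idx]

    def release(self, idx: int) -> None:
        """Hand a buffer back to the pool; reject double or bogus releases."""
        if idx not in self._in_use:
            raise ValueError(f"buffer {idx} is not in use")
        self._in_use.remove(idx)
        self._free.append(idx)

    @property
    def available(self) -> int:
        return len(self._free)
```

A sender would call `acquire()`, fill the returned view in place, post it for transmission, and call `release(idx)` once the transfer completes. Returning `None` on exhaustion rather than allocating a new buffer is a design choice of this sketch: it keeps every buffer inside the pre-registered region.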
Keywords/Search Tags: distributed memory database, query engine, RDMA, memory pool