
Design And Implementation Of Zero Copy RPC Over RDMA

Posted on: 2021-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: H P Zhou
Full Text: PDF
GTID: 2428330647951073
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of web services and cloud computing, the traditional TCP/IP network stack can hardly meet the networking performance requirements of next-generation data center applications. Remote Direct Memory Access (RDMA) avoids the heavy CPU overhead of a complicated software stack by offloading the network stack to hardware, providing high throughput while keeping network latency low. Since the RoCEv2 specification was released, RDMA has been deployed in more and more data centers. Nevertheless, the complexity of the RDMA verbs programming interface remains a barrier to wider adoption of RDMA in data centers. Data center applications want the performance advantages of RDMA together with high-level abstractions that make programming easier. Remote procedure call (RPC) is one of the most common communication patterns in data centers. In this paper, we present zRPC, a zero-copy RPC framework over RDMA. zRPC achieves both high throughput and low latency through careful design of every component of the RPC framework and optimization for the characteristics of RDMA. This paper consists of the following three parts:

An RDMA memory management mechanism called rmalloc. rmalloc has performance comparable to that of generic memory allocators. It provides applications with a generic and flexible memory registration mechanism through memory registration hooks and context. Our benchmark shows that rmalloc strikes a balance between memory usage and the throughput of memory operations. It achieves much higher throughput than several existing RDMA memory management mechanisms under multithreaded memory allocation workloads.

A zero-copy serialization mechanism called zFlatBuffers. zFlatBuffers completely eliminates memory copy operations during serialization. By redesigning the buffer hierarchy and extending FlatBuffers, it can share data between messages efficiently, which further reduces memory usage and CPU overhead during serialization. Our benchmark shows that zFlatBuffers effectively reduces the total amount of memory allocated during serialization and reduces serialization time by 44.11% compared to FlatBuffers.

The zRPC message transport mechanism. It provides message transport over RDMA with low CPU overhead and low latency. It improves the resource efficiency of RDMA by adopting a shared receive queue, and thus improves the scalability of reliable connection channels. By using scatter/gather lists and one-sided operations, the zRPC message transport mechanism avoids memory copies during message transport. Our performance evaluation shows that the throughput of the zRPC transport mechanism scales linearly with message size until it is bottlenecked by network bandwidth, while still keeping network latency moderate.

Furthermore, we evaluate the zRPC framework as a whole. The performance evaluation shows that zRPC outperforms eRPC in terms of both throughput and latency. In the log replication scenario, the performance of zRPC scales with the number of concurrent connections with the help of the zero-copy message embedding mechanism provided by zFlatBuffers.
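The scatter/gather idea mentioned in the abstract can be illustrated with a toy model. The sketch below is not zRPC code and does not use the RDMA verbs API. It uses Python `memoryview` objects as stand-ins for scatter/gather entries, and every name in it (`Segment`, `describe_message`, `simulated_nic_gather`) is hypothetical. The point it shows: a message made of a header and a payload held in separate buffers can be described as a list of segments rather than copied into one contiguous buffer, and the bytes are assembled only at the point that stands in for the hardware.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One gather entry: a window onto memory owned by the caller (hypothetical type)."""
    view: memoryview

    @property
    def length(self) -> int:
        return self.view.nbytes


def make_segment(buf: bytearray, offset: int, length: int) -> Segment:
    # Slicing a memoryview creates a new view object; the bytes are not copied.
    return Segment(memoryview(buf)[offset:offset + length])


def describe_message(header: bytearray, payload: bytearray) -> List[Segment]:
    """Describe a message as a gather list instead of concatenating its parts."""
    return [
        make_segment(header, 0, len(header)),
        make_segment(payload, 0, len(payload)),
    ]


def simulated_nic_gather(sg_list: List[Segment]) -> bytes:
    """Stand-in for the NIC: the only place where the bytes are assembled."""
    return b"".join(seg.view for seg in sg_list)


if __name__ == "__main__":
    header = bytearray(b"HDR1")
    payload = bytearray(b"log-entry-0001")
    sg = describe_message(header, payload)

    # Changing the payload after the list is built is visible through the
    # segment, which shows that the segment refers to the original memory.
    payload[0] = ord("L")
    print(simulated_nic_gather(sg))
```

In a real verbs program the analogous structure is an array of `ibv_sge` entries (address, length, local key) attached to a work request, and the gathering is done by the network adapter rather than by a `join`. How zRPC lays out its segments is not stated in the abstract, so the two-segment header/payload split here is only an assumption for illustration.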
Keywords/Search Tags: Data Center Network, Remote Direct Memory Access, Memory Management, Serialization, Remote Procedure Call