
Research On Construction Of RDMA End-host System For Multi-application Datacenters

Posted on: 2022-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: H X Song
GTID: 2518306725481354
Subject: Computer technology
Abstract/Summary:
RDMA is increasingly deployed in data centers to meet the demands of ultra-low latency, high throughput, and low CPU overhead. However, it is not easy to migrate existing applications from the TCP/IP stack to RDMA. On the one hand, although a specification is provided, the implementation details of the hardware-based network stack are hidden from users. System developers and network maintainers have to devote considerable time and effort to measuring and evaluating the performance of this "black box" through phase-by-phase configuration optimization in order to understand the behavior of commercial RDMA devices. After operating a high-speed RDMA network, we identify multiple hidden costs that may cause degraded and/or unpredictable performance in RDMA-based applications. These hidden costs include the combination of complicated parameter settings, the scalability of Reliable Connections, two-sided memory management and page alignment, and resource contention among diverse traffic classes. On the other hand, existing RDMA technologies are ill-suited to multi-tenant datacenters, where applications run at massive scale, tenants require isolation, and the workload mix changes over time. Using microbenchmarks and real open-source RDMA applications, we identify a series of performance anomalies that appear when multiple applications coexist. They arise from a fundamental tradeoff between performance isolation and work conservation.

In the first part of this thesis, we present a detailed analysis of how the commodity hardware implementation affects the performance of data transmission in an RDMA network, and we point out multiple issues related to software implementation. To address these problems, we introduce an RDMA middleware, a suite that allows developers to maximize the benefit of RDMA by i) eliminating resource contention at the NIC cache through asynchronous resource sharing; ii) introducing hybrid page management based on message sizes; and iii) isolating flows of different traffic classes based on hardware features. We implement a prototype of the RDMA middleware and verify its effectiveness by rebuilding an RPC message service. The rebuilt service demonstrates high throughput for large messages and low latency for small messages, without compromising low CPU utilization or scalability to a large number of active connections.

In the second part of this thesis, to overcome the limitations of hardware QoS capability, we propose an RDMA QoS system that schedules all RDMA traffic in the software layer. The system measures the queuing delay of latency-sensitive messages at end hosts and modulates the sending rate of bandwidth-sensitive messages according to that delay. In this way, for small messages, the system only needs to record two timestamps, without any additional scheduling overhead. For large messages, the RDMA QoS system maximizes link bandwidth utilization subject to meeting the low-latency target for small messages. The RDMA QoS system ensures that the average tail delay of small messages is less than 20 microseconds in a cluster of 20 machines under various complex scenarios, while the overall link bandwidth utilization of the RNIC exceeds 90%.
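The delay-driven rate control described for the QoS system can be illustrated with a minimal sketch. The abstract does not state the actual control law, so the additive-increase/multiplicative-decrease rule below is an assumption, as are the class name `DelayBasedRateController`, the 100 Gbps line rate, the step sizes, and the use of 20 microseconds as the delay target. The sketch only shows the general idea: two timestamps per small message yield a queuing delay, and that delay steers the sending rate of large messages.

```python
from dataclasses import dataclass


@dataclass
class DelayBasedRateController:
    """Hypothetical AIMD controller; NOT the thesis's actual algorithm."""

    link_rate_gbps: float = 100.0     # assumed line rate of the RNIC
    target_delay_us: float = 20.0     # assumed latency target for small messages
    min_rate_gbps: float = 1.0        # assumed floor so large flows never starve
    additive_step_gbps: float = 5.0   # assumed increase step
    decrease_factor: float = 0.5      # assumed multiplicative decrease
    rate_gbps: float = 100.0          # current rate allowed for large messages

    def on_small_message(self, posted_us: float, completed_us: float) -> float:
        """Update the large-message rate from one small message's two timestamps."""
        queuing_delay_us = completed_us - posted_us
        if queuing_delay_us > self.target_delay_us:
            # Small messages are queuing too long: back off bandwidth-sensitive traffic.
            self.rate_gbps = max(self.min_rate_gbps,
                                 self.rate_gbps * self.decrease_factor)
        else:
            # Latency target is met: reclaim link bandwidth gradually.
            self.rate_gbps = min(self.link_rate_gbps,
                                 self.rate_gbps + self.additive_step_gbps)
        return self.rate_gbps


if __name__ == "__main__":
    ctrl = DelayBasedRateController()
    print(ctrl.on_small_message(posted_us=0.0, completed_us=50.0))    # delay above target
    print(ctrl.on_small_message(posted_us=100.0, completed_us=105.0))  # delay below target
```

In this sketch the small-message path does nothing beyond taking the two timestamps, which mirrors the abstract's claim that latency-sensitive messages incur no extra scheduling overhead; all pacing cost falls on the bandwidth-sensitive path.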
Keywords/Search Tags:RDMA, Scalability, QoS