
Research On Userspace Communication Framework For Low-Latency Distributed Block Storage

Posted on: 2020-09-26 | Degree: Master | Type: Thesis
Country: China | Candidate: H Z Zhang | Full Text: PDF
GTID: 2428330590958331 | Subject: Computer system architecture
Abstract/Summary:
Distributed block storage systems play an important role in big data and high-performance computing. The rapid development of network and storage technologies has enabled distributed block storage systems to provide extremely low access latency. Once network and storage device latency reaches the microsecond range, the overhead of the software stack becomes a significant part of the overall access cost. To achieve lower latency and higher throughput, a lightweight Remote Procedure Call (RPC) framework, uRPC, is proposed for low-latency distributed block storage systems. uRPC communicates over Remote Direct Memory Access (RDMA) and uses a Poll Mode Driver (PMD) threading model to avoid the overhead of kernel thread scheduling. To suit the characteristics of the PMD threading model, a connection management scheme based on a global connection pool is proposed, which provides load balancing and fair scheduling across CPU cores. In addition, an RDMA memory pool scheme partitioned by CPU core is proposed to reduce the allocation overhead of RDMA memory. To address the limited efficiency of RDMA when transmitting small blocks, a message aggregation scheme is proposed to improve the throughput of uRPC. Finally, for the data plane of distributed block storage, an optimization of the RPC communication model based on RDMA immediate semantics is proposed to reduce access latency and improve throughput. Results show that uRPC request latency is as low as 5 microseconds and that a single server can reach a throughput of 25 million requests per second. The message aggregation scheme increases uRPC throughput by about 2.5 times. For in-memory block storage, the RPC communication model optimization reduces the latency of 4 KB read requests by approximately 40% while increasing throughput by approximately 12%.
Keywords/Search Tags:Distributed block storage, Remote Procedure Call, Poll Mode Driver, Remote Direct Memory Access
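
The abstract does not describe how uRPC's message aggregation is implemented. As a rough illustration of the general idea only (packing several small messages into one larger transfer so that per-send cost is amortized), the following is a minimal sketch. The length-prefixed framing, the `Aggregator` class, the `unpack` helper, and the `max_bytes` limit are all assumptions made for this example, and the `flushed` list merely stands in for posting an RDMA send.

```python
import struct
from typing import List

# Assumed framing: each message is preceded by a 4-byte little-endian length.
HEADER = struct.Struct("<I")


class Aggregator:
    """Hypothetical batcher: packs small messages into one buffer per send."""

    def __init__(self, max_bytes: int = 4096) -> None:
        self.max_bytes = max_bytes
        self._buf = bytearray()
        # Stand-in for the transport: each entry models one aggregated send.
        self.flushed: List[bytes] = []

    def submit(self, msg: bytes) -> None:
        framed = HEADER.size + len(msg)
        if framed > self.max_bytes:
            raise ValueError("message larger than aggregation buffer")
        # Flush first if this message would overflow the current batch.
        if len(self._buf) + framed > self.max_bytes:
            self.flush()
        self._buf += HEADER.pack(len(msg))
        self._buf += msg

    def flush(self) -> None:
        # Emit the pending batch, if any, as a single transfer.
        if self._buf:
            self.flushed.append(bytes(self._buf))
            self._buf.clear()


def unpack(batch: bytes) -> List[bytes]:
    """Receiver side: split one aggregated buffer back into messages."""
    msgs = []
    offset = 0
    while offset < len(batch):
        (length,) = HEADER.unpack_from(batch, offset)
        offset += HEADER.size
        msgs.append(batch[offset:offset + length])
        offset += length
    return msgs
```

In this sketch a caller submits messages as they arrive and calls `flush()` at the end of a polling iteration, so a batch is never held back indefinitely. A real implementation would also have to weigh batch size against the added queueing delay, a trade-off the abstract does not quantify.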