Research On Distributed Message Queue Based On RDMA And NVM

Posted on: 2020-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: W Luo
Full Text: PDF
GTID: 2428330599953592
Subject: engineering
Abstract/Summary:
In a distributed environment, a message queue can effectively solve communication problems between subsystems while reducing the coupling between systems, enabling asynchronous message consumption, and smoothing traffic peaks. Existing message queues store messages in memory or on disk and transmit data over Ethernet, so they fail to balance message persistence against high throughput. New storage and network technologies that have emerged in recent years offer an opportunity to break through this dilemma. On the one hand, Non-Volatile Memory (NVM) is byte-addressable and offers memory-level read and write speeds; on the other hand, Remote Direct Memory Access (RDMA) can read and write remote memory directly without occupying the remote CPU. This paper optimizes both the persistent storage of messages and the message transmission mechanism. Exploiting the characteristics of NVM and RDMA, it proposes FlashQ, a distributed message queue that provides both message persistence and high throughput, ensuring highly reliable delivery together with high-throughput, low-latency messaging.

The main contributions of this paper are as follows:

(1) A high-performance message storage technique based on NVM. Most existing message queues persist messages in a file system built on a block device such as a disk, so accessing a message requires traversing a complex I/O software stack. FlashQ uses the properties of NVM to access message files through the virtual address space of the process, avoiding the slow I/O software stack. In addition, FlashQ maps message metadata into the process's virtual address space so that the position of any message can be located quickly by random access.

(2) A fast, lock-free message transmission mechanism based on RDMA one-sided operations. Remote message data is accessed directly through RDMA WRITE and RDMA READ without the participation of the remote CPU, achieving high-throughput, low-latency message transmission. This paper divides each Topic into multiple Partitions to avoid conflicting remote writes during message publishing, which makes remote message writes lock-free. For the remote write path, it proposes an adaptive message batching strategy based on the message production speed and the transmission speed, which reduces transmission latency and improves transmission bandwidth.

(3) A fast indexing mechanism based on production time. FlashQ indexes messages by their production time. By building a hierarchical index structure with multiple levels of time precision, together with a matching retrieval mechanism, it can quickly locate the messages produced in any given second, improving the efficiency of message backtracking.

(4) A disaster recovery mechanism and a load balancing strategy that ensure the high availability of FlashQ. For disaster recovery, FlashQ keeps redundant backups of all message files and recovers quickly when a server suffers a single point of failure. For load balancing, FlashQ distributes Topic Partitions across message servers as evenly as possible, which avoids exhausting the storage space of a single machine and avoids the loss of transmission performance caused by too many connections competing for bandwidth on a single machine.

Finally, based on the proposed design, this paper implements the distributed message queue FlashQ in a Linux environment, and designs and implements a test tool for message production and consumption to carry out verification experiments. The experimental results show that when FlashQ transmits a commonly used message length of 1 KB, the publishing throughput of a single Topic Partition is close to 690,000 messages/sec, which is 6.4 times that of Qpid (RDMA) and 7.5 times that of Kafka. When 1000 messages are produced in a row, the average publishing latency is only about 13 microseconds, which is about 1/20 of that of Qpid (RDMA) in the same environment and about 1/1400 of that of Kafka. When messages are consumed in push mode, the throughput is close to 690,000 messages/sec, which is 6.4 times that of Qpid (RDMA). When messages are consumed in pull mode with only one message pulled at a time, the throughput reaches nearly 160,000 messages/sec, which is nearly 4 times that of Kafka.
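
The production-time index of contribution (3) can be illustrated with a minimal sketch. The abstract does not specify how many precision levels FlashQ uses or how the index is laid out, so everything below is an assumption made for illustration: the class name `TimeIndex`, the methods `record` and `seek`, the choice of two levels (minute, then second), and the use of in-memory Python containers in place of NVM-resident structures. It also assumes messages are recorded in non-decreasing timestamp order, as in an append-only log.

```python
from bisect import bisect_left


class TimeIndex:
    """Hypothetical two-level (minute -> second) index over an append-only log.

    Illustrative only: not FlashQ's actual data structure.
    """

    def __init__(self):
        self._minutes = []       # sorted minute buckets that contain messages
        self._seconds = {}       # minute -> sorted list of seconds seen in it
        self._first_offset = {}  # second -> offset of its first message

    def record(self, ts_sec: int, offset: int) -> None:
        """Register a message; assumes non-decreasing ts_sec across calls."""
        if ts_sec in self._first_offset:
            return  # only the first message of each second is indexed
        minute = ts_sec // 60
        if minute not in self._seconds:
            self._minutes.append(minute)
            self._seconds[minute] = []
        self._seconds[minute].append(ts_sec)
        self._first_offset[ts_sec] = offset

    def seek(self, ts_sec: int):
        """Return the offset of the first message produced at or after ts_sec.

        Returns None when no such message exists.
        """
        # Coarse level: find the first minute bucket that could match.
        i = bisect_left(self._minutes, ts_sec // 60)
        while i < len(self._minutes):
            # Fine level: binary search the seconds inside that minute.
            secs = self._seconds[self._minutes[i]]
            j = bisect_left(secs, ts_sec)
            if j < len(secs):
                return self._first_offset[secs[j]]
            i += 1  # nothing late enough in this minute; try the next bucket
        return None
```

A consumer that wants to replay messages from a given moment would call `seek(timestamp)` and start reading the log at the returned offset. The coarse level narrows the search to one bucket before the fine level resolves the exact second, which is the general idea behind a multi-precision index; the actual FlashQ layout may differ.
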
Keywords/Search Tags:Distributed message queue, Persistence, NVM, RDMA