Introduce Redis Over RDMA protocol #12217
RDMA is the abbreviation of remote direct memory access. It is a technology that enables computers in a network to exchange data in main memory without involving the processor, cache, or operating system of either computer. As a result, RDMA can perform better than TCP: test results show that Redis Over RDMA achieves ~2.5X the QPS with lower latency.

In recent years, RDMA has become popular in the data center; in particular, the RoCE (RDMA over Converged Ethernet) architecture is widely used.

Introduce the Redis Over RDMA protocol as a new transport for Redis. For now, we define 4 commands:
- GetServerFeature & SetClientFeature: these two commands are used to negotiate features for further extension. No features are defined in this version. Flow control and multi-buffer may be supported in the future, and they would need feature negotiation.
- Keepalive
- RegisterXferMemory: the heart of the protocol, used to transfer the real payload.

The 'TX buffer' and 'RX buffer' are built on RDMA remote memory using RDMA write/write with imm. The design is similar to (but not the same as) several mechanisms introduced in papers:
- Socksdirect: datacenter sockets can be fast and compatible <https://dl.acm.org/doi/10.1145/3341302.3342071>
- LITE Kernel RDMA Support for Datacenter Applications <https://dl.acm.org/doi/abs/10.1145/3132747.3132762>
- FaRM: Fast Remote Memory <https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-dragojevic.pdf>

Co-authored-by: Xinhao Kong <[email protected]>
Co-authored-by: Huaping Zhou <[email protected]>
Co-authored-by: zhuo jiang <[email protected]>
Co-authored-by: Yiming Zhang <[email protected]>
Co-authored-by: Jianxi Ye <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
Hello @pizhenwei, I'm very interested in your proposal, and the protocol seems very novel compared to other RDMA implementations like brpc and NVMe-oF. I also have some questions about the proposal; I hope I didn't miss anything.
In general, I think the protocol and implementation are neat and beautiful. They deserve more attention, whether for research/study or for production.
Hi, I tried to describe the differences and compare this with other protocols, please see link.
Just think of a QP (RC type) as a connection, like a TCP/TLS/Unix socket. If a client uses N sockets, it may need N QPs (in fact, many sockets also waste resources in the kernel).
Currently, only one memory region per QP is defined, and the protocol sets no strict limit on the memory region size. As far as I can see in the engineering implementation:
For example, to transfer a 10MB string over 1MB of memory, it works like this:
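The step-by-step list that followed this sentence is not preserved here, so the following is only a rough sketch of the general idea, not the PR's actual flow: a payload larger than the receive buffer is pushed through it one fill at a time, and the buffer is reused after the receiver drains it. The function name `transfer_in_chunks` and the in-process `bytearray` standing in for registered remote memory are assumptions made for illustration; no real RDMA verbs are involved.

```python
def transfer_in_chunks(payload, rx_buffer_size):
    """Move payload through a fixed-size receive buffer, one fill at a time.

    Hypothetical simulation: rx_buffer stands in for the peer's registered
    memory region, and the slice assignment stands in for an RDMA write.
    """
    rx_buffer = bytearray(rx_buffer_size)
    received = bytearray()  # data the receiver has consumed so far
    rounds = 0
    offset = 0
    while offset < len(payload):
        chunk = payload[offset:offset + rx_buffer_size]
        rx_buffer[:len(chunk)] = chunk      # sender fills the remote buffer
        received += rx_buffer[:len(chunk)]  # receiver drains it, freeing it for reuse
        offset += len(chunk)
        rounds += 1
    return bytes(received), rounds


MB = 1024 * 1024
data = bytes(range(256)) * (10 * MB // 256)  # 10MB of non-trivial bytes
out, rounds = transfer_in_chunks(data, 1 * MB)
assert out == data
print(rounds)  # 10
```

In this simplified model the 10MB string needs ten fills of the 1MB buffer. How the real protocol signals "buffer consumed, ready for reuse" between peers is not shown here.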
Thanks!
So sad, years of waiting have made me lose patience and confidence.
With this version of the protocol, we achieve the following goals: