Towards connection-scalable RNIC architecture

Published in The Journal of Supercomputing

Abstract

Remote Direct Memory Access (RDMA) is a widely adopted optimization in datacenter networking that outperforms traditional kernel-based TCP/IP networking through mechanisms such as kernel bypass and hardware offloading. However, RDMA faces a scalability challenge in connection management: the limited on-chip memory capacity of the RDMA Network Interface Card (RNIC) forces connection context to be stored in the RNIC's memory, which causes considerable performance degradation when a large number of connections must be maintained. In this paper, we propose a novel RNIC microarchitecture that achieves peak performance and scales well with the number of connections. First, we model the RNIC and identify two key factors that degrade performance as the number of connections grows: head-of-line blocking when accessing connection context, and connection context dependency in transmission processing. To address head-of-line blocking, we combine a non-blocking connection requester with a connection context management module that processes prepared connections first, sustaining peak message rate under large connection counts. To eliminate connection context dependency in the RNIC, we further deploy a latency-hiding connection context scheduling strategy that maintains low latency as the number of connections increases. We implement and evaluate our design, demonstrating that it sustains peak message rate (66.4 Mop/s) and low latency (3.89 µs) while scaling to over 50,000 connections with a smaller on-chip memory footprint.
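The abstract names two mechanisms: ready-first selection to avoid head-of-line blocking, and latency-hiding scheduling of context fetches. As a rough illustration only, the toy C sketch below (our own construction, not code from the paper; names such as `struct qp`, `ctx_on_chip`, and `pick_nonblocking` are hypothetical) contrasts a head-of-line-blocking requester, which stalls on the first connection whose context must be fetched, with a ready-first requester that serves prepared connections immediately and issues an asynchronous context fetch for blocked ones so the fetch latency overlaps useful work.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NUM_QP 8 /* toy number of connections (queue pairs) */

/* Hypothetical per-connection state: whether the connection context
 * is resident on chip, and whether the connection has pending work. */
struct qp {
    int  id;
    bool ctx_on_chip; /* context already present in RNIC memory */
    bool has_work;    /* pending work-queue elements */
};

/* Head-of-line blocking: the requester commits to the first connection
 * with work and stalls there if its context must be fetched from host. */
static int pick_blocking(const struct qp *qps, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (qps[i].has_work)
            return (int)i; /* stalls here when !ctx_on_chip */
    return -1;
}

/* Non-blocking, ready-first selection in the spirit of the abstract:
 * serve a prepared connection now, and kick off an asynchronous
 * context fetch for the first blocked one to hide the fetch latency. */
static int pick_nonblocking(struct qp *qps, size_t n) {
    int blocked = -1;
    for (size_t i = 0; i < n; i++) {
        if (!qps[i].has_work)
            continue;
        if (qps[i].ctx_on_chip)
            return (int)i; /* prepared: serve immediately */
        if (blocked < 0) {
            blocked = (int)i;
            /* In hardware this would enqueue a context-fetch request;
             * the flag flip below stands in for its later completion. */
            qps[i].ctx_on_chip = true;
        }
    }
    return blocked; /* only taken when no prepared connection exists */
}

int main(void) {
    struct qp qps[NUM_QP];
    for (int i = 0; i < NUM_QP; i++) /* odd QPs start with context on chip */
        qps[i] = (struct qp){ i, i % 2 == 1, true };

    printf("blocking pick:     QP %d\n", pick_blocking(qps, NUM_QP));
    printf("non-blocking pick: QP %d\n", pick_nonblocking(qps, NUM_QP));
    return 0;
}
```

Under this toy model the blocking requester picks QP 0 and would stall on its context fetch, while the ready-first requester picks QP 1 and fetches QP 0's context in the background; the paper's latency-hiding scheduling strategy pursues the same overlap, albeit in RNIC hardware rather than software.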

Availability of data and materials

All data generated or analyzed during this study are included in this published article. The relevant program can be found in [14].

References

  1. Zhu Y et al (2015) Congestion control for large-scale RDMA deployments. ACM SIGCOMM Comput Commun Rev 45(4):523–536

  2. Guo C et al (2016) RDMA over commodity ethernet at scale. In: Proceedings of the 2016 ACM SIGCOMM Conference

  3. Gao Y et al (2021) When cloud storage meets RDMA. In: 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21)

  4. Li Y et al (2019) HPCC: high precision congestion control. In: Proceedings of the ACM Special Interest Group on Data Communication, pp 44–58

  5. Data Center Dynamics (2024) What is a hyperscale data center? Website. https://www.datacenterdynamics.com/en/analysis/what-is-a-hyperscale-data-center/

  6. Zamanian E et al (2016) The end of a myth: Distributed transactions can scale. arXiv preprint arXiv:1607.00655

  7. Chen Y, Lu Y, Shu J (2019) Scalable RDMA RPC on reliable connection with efficient resource sharing. In: Proceedings of the Fourteenth EuroSys Conference 2019

  8. Sidler D et al (2020) StRoM: smart remote memory. In: Proceedings of the Fifteenth European Conference on Computer Systems

  9. Xilinx (2022) ERNIC. Website. https://www.xilinx.com/products/intellectual-property/ef-di-ernic.html

  10. Schelten N et al (2020) A high-throughput, resource-efficient implementation of the RoCEv2 remote DMA protocol for network-attached hardware accelerators. In: 2020 International Conference on Field-Programmable Technology (ICFPT). IEEE

  11. NVIDIA (2022) Mellanox adapters programmer’s reference manual (PRM). Website. https://network.nvidia.com/related-docs/user_manuals/Ethernet_Adapters_Programming_Manual.pdf

  12. Wang X et al (2021) StaR: breaking the scalability limit for RDMA. In: 2021 IEEE 29th International Conference on Network Protocols (ICNP). IEEE

  13. InfiniBand Trade Association (2022) InfiniBand architecture specification, Vol. 1. Website. https://cw.infinibandta.org/document/dl/8567

  14. NCSG Group (2022) csRNA. https://github.com/ncsg-group/csRNA

  15. Kang N et al (2022) csRNA: Connection-Scalable RDMA NIC Architecture in Datacenter Environment. In: 2022 IEEE 40th International Conference on Computer Design (ICCD). IEEE

  16. Kalia A, Kaminsky M, Andersen DG (2016) Design guidelines for high performance RDMA systems. In: 2016 USENIX Annual Technical Conference (USENIX ATC 16)

  17. Mellanox Technologies (2023) mthca driver. Website. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/infiniband/hw/mthca

  18. Kalia A, Kaminsky M, Andersen DG (2019) Datacenter RPCs can be general and fast. In: 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)

  19. Neugebauer R et al (2018) Understanding PCIe performance for end host networking. In: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication

  20. Wang Z et al (2020) Shuhai: benchmarking high bandwidth memory on FPGAs. In: 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE

  21. Wang Z et al (2023) SRNIC: a scalable architecture for RDMA NICs. In: 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)

  22. NVIDIA (2024) ConnectX-5 InfiniBand adapter cards. Website. https://nvdam.widen.net/s/pkxbnmbgkh/networking-infiniband-datasheet-connectx-5-2069273

  23. Mellanox Technologies (2023) mlx5 driver. Website. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/infiniband/hw/mlx5

  24. Lowe-Power J et al (2020) The gem5 simulator: Version 20.0+. arXiv preprint arXiv:2007.03152

  25. Monga SK, Kashyap S, Min C (2021) Birds of a feather flock together: scaling RDMA RPCs with Flock. In: Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles

  26. Mellanox Technologies (2022) libibverbs. Website. https://github.com/linux-rdma/rdma-core/tree/master/libibverbs

  27. NVIDIA (2024) ConnectX-6 InfiniBand adapter card. Website. https://nvdam.widen.net/s/5j7xtzqfxd/connectx-6-infiniband-datasheet-1987500-r2

  28. Pan P et al (2019) Towards stateless RNIC for data center networks. In: Proceedings of the 3rd Asia-Pacific Workshop on Networking 2019

  29. Tsai S-Y, Zhang Y (2017) LITE kernel RDMA support for datacenter applications. In: Proceedings of the 26th Symposium on Operating Systems Principles

  30. Singhvi A et al (2020) 1RMA: re-envisioning remote memory access for multi-tenant datacenters. In: Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication

  31. NVIDIA (2024) Dynamically connected transport. Website. https://docs.nvidia.com/networking/display/bf3dpu/introduction

  32. Mittal R et al (2018) Revisiting network support for RDMA. In: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication

Funding

This work is supported in part by the National Key Research and Development Program of China (No. 2021YFB0300700), the International Partnership Program of the Chinese Academy of Sciences (No. 171111KYSB20180011), the National Science and Technology Innovation Program 2030 (No. 2020AAA0104402), NSFC (No. 61972380), and the Research and Innovation Program of the State Key Laboratory of Computer Architecture, ICT, CAS (No. CARCH5407).

Author information

Authors and Affiliations

Authors

Contributions

Ning Kang, Guangming Tan, Guojun Yuan, and Zhan Wang wrote the main part of the manuscript. Ning Kang, Fan Yang, Zhenlong Ma, and Xiaoxiao Ma were responsible for the design and evaluation. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zhan Wang.

Ethics declarations

Ethical approval

Not applicable.

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kang, N., Wang, Z., Yang, F. et al. Towards connection-scalable RNIC architecture. J Supercomput (2024). https://doi.org/10.1007/s11227-024-05991-4
