CCRP: Converging Credit-Based and Reactive Protocols in Datacenters

International Journal of Parallel Programming

Abstract

As link speeds have grown steadily from 10 Gbps to 100 Gbps, high-speed data center networks (DCNs) require more efficient congestion management. Proactive transports, especially credit-based congestion control, have therefore drawn much attention because of their fast convergence, near-zero queueing, and low latency. In real deployments, however, it is hard to guarantee that a single protocol is deployed on every host at the same time. When credit-based protocols are deployed into DCNs incrementally, the network therefore enters a multi-protocol state and faces three fundamental challenges: (i) unfairness, (ii) non-convergence, and (iii) high buffer occupancy. In this paper, we propose a new protocol, called CCRP, that aims to converge credit-based and reactive protocols in data centers. Targeting the most widely deployed protocol in DCNs, namely DCQCN, which is based on explicit congestion notification (ECN), CCRP leverages forward ECN to detect congestion in the data queue and optimizes the feedback control of credit-based transports. Our experimental results show that this design addresses unfair link allocation and converges with reactive protocols rapidly. Furthermore, CCRP achieves high utilization and low buffer occupancy at the same time.
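The abstract does not specify CCRP's control law, so the following is only an illustrative sketch of the general idea it names: a receiver that paces credits and lowers its credit rate when data packets arrive with ECN marks. The class name `CreditRateController`, its parameters, and the DCTCP-style update rule (smoothed marked fraction, multiplicative cut, additive increase) are all assumptions chosen for illustration, not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class CreditRateController:
    """Hypothetical receiver-side credit pacer driven by ECN marks on data packets.

    This is NOT the CCRP algorithm; it borrows a DCTCP-style rule as a stand-in.
    """
    max_rate: float        # credit rate ceiling, e.g. credits per second (assumed)
    min_rate: float        # credit rate floor (assumed)
    rate: float            # current credit rate
    gain: float = 1 / 16   # EWMA gain for the marked fraction (assumed)
    alpha: float = 0.0     # smoothed fraction of ECN-marked data packets
    step: float = 0.05     # additive increase, as a fraction of max_rate (assumed)

    def update(self, data_packets: int, marked_packets: int) -> float:
        """Run once per update period with the counts observed in that period."""
        if data_packets <= 0:
            return self.rate  # nothing observed, keep the current rate
        fraction = marked_packets / data_packets
        self.alpha = (1 - self.gain) * self.alpha + self.gain * fraction
        if marked_packets > 0:
            # Marks signal a growing data queue: cut the rate in proportion to alpha.
            self.rate *= 1 - self.alpha / 2
        else:
            # No marks in this period: probe for bandwidth additively.
            self.rate += self.step * self.max_rate
        # Keep the rate inside the configured bounds.
        self.rate = min(self.max_rate, max(self.min_rate, self.rate))
        return self.rate


if __name__ == "__main__":
    ctrl = CreditRateController(max_rate=100.0, min_rate=1.0, rate=50.0)
    print(ctrl.update(data_packets=10, marked_packets=0))   # rate rises
    print(ctrl.update(data_packets=10, marked_packets=10))  # rate falls
```

In this sketch a single marked period causes only a mild reduction because `alpha` is smoothed, while persistent marking drives the credit rate toward `min_rate`. How CCRP actually weighs ECN feedback against credit feedback is described in the full article, not in the abstract.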

Acknowledgements

We would like to thank the anonymous reviewers for their insightful comments. We gratefully acknowledge the members of the Tianhe interconnect group at NUDT for many inspiring conversations. This work was supported by the National Key R&D Program of China under Grant No. 2018YFB0204300.

Author information

Corresponding author

Correspondence to Dezun Dong.

About this article

Cite this article

Bai, Y., Hu, D., Dong, D. et al. CCRP: Converging Credit-Based and Reactive Protocols in Datacenters. Int J Parallel Prog 49, 685–699 (2021). https://doi.org/10.1007/s10766-021-00698-y

  • DOI: https://doi.org/10.1007/s10766-021-00698-y
