Regular Paper

Harmonia: Explicit Congestion Notification and Credit-Reservation Transport Converged Congestion Control in Datacenters

College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China

A preliminary version of the paper was published in the Proceedings of ICPADS 2019.

Abstract

Bursty traffic and thousands of concurrent flows inevitably cause congestion in datacenter networks (DCNs), which degrades overall performance. Various transport protocols, both reactive and proactive, have been developed to mitigate this congestion. Reactive schemes use congestion signals, such as explicit congestion notification (ECN) and round-trip time (RTT), to respond after congestion arises. However, as datacenter scale and link speeds grow, reactive schemes respond to congestion too slowly. In contrast, proactive protocols (e.g., credit-reservation protocols) are designed to avoid congestion before it occurs, and they offer zero data loss, fast convergence, and low buffer occupancy. Nevertheless, credit-reservation protocols have not been widely deployed in current DCNs (e.g., those of Microsoft and Amazon), which mainly deploy ECN-based protocols such as data center transmission control protocol (DCTCP) and data center quantized congestion notification (DCQCN). Moreover, in an actual deployment, it is hard to guarantee that one protocol is deployed on every server at the same time. When a credit-reservation protocol is deployed in a DCN incrementally, the network enters a multi-protocol state and faces the following fundamental challenges: 1) unfairness, 2) high buffer occupancy, and 3) heavy tail latency. Therefore, we propose Harmonia, which aims to make ECN-based and credit-reservation protocols converge to fairness with minimal modification. To the best of our knowledge, Harmonia is the first to address the problem of harmonizing proactive and reactive congestion control. Targeting the common ECN-based protocols DCTCP and DCQCN, Harmonia leverages forward ECN and RTT to deliver real-time congestion information and redefines feedback control. Evaluation results show that Harmonia effectively resolves unfair link allocation, eliminates timeouts, and addresses buffer overflow.
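
For readers unfamiliar with the reactive, ECN-based control loop mentioned above, the following is a minimal sketch of the standard DCTCP sender reaction. It is not Harmonia's redefined feedback control, which the abstract does not specify. The function names, the gain g = 1/16, and the window floor of one segment are illustrative assumptions.

```python
def update_alpha(alpha, marked, total, g=1.0 / 16):
    """Update DCTCP's running estimate of the ECN-marked fraction.

    alpha  : previous estimate, in [0, 1]
    marked : packets acknowledged with an ECN mark in the last window
    total  : packets acknowledged in the last window
    g      : EWMA gain (1/16 is an assumed, commonly used value)
    """
    frac = marked / total if total else 0.0
    return (1 - g) * alpha + g * frac


def reduce_cwnd(cwnd, alpha, min_cwnd=1.0):
    """Shrink the congestion window in proportion to the estimated congestion.

    With alpha = 1 (every packet marked) the window is halved;
    with alpha = 0 (no marks) it is left unchanged.
    min_cwnd is an assumed floor of one segment.
    """
    return max(min_cwnd, cwnd * (1 - alpha / 2))


if __name__ == "__main__":
    alpha = 0.0
    cwnd = 100.0
    # Hypothetical per-window (marked, total) observations.
    for marked, total in [(0, 100), (50, 100), (100, 100)]:
        alpha = update_alpha(alpha, marked, total)
        if marked:
            cwnd = reduce_cwnd(cwnd, alpha)
        print(f"marked={marked:3d} alpha={alpha:.4f} cwnd={cwnd:.2f}")
```

The sender acts only after switches have already marked packets, which is the reactive behavior the abstract contrasts with credit-reservation protocols, where data is sent only once credit has been granted.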

Electronic Supplementary Material

jcst-36-5-1071-Highlights.pdf (947.2 KB)

Journal of Computer Science and Technology
Pages 1071-1086
Cite this article:
Hu D-H, Dong D-Z, Bai Y, et al. Harmonia: Explicit Congestion Notification and Credit-Reservation Transport Converged Congestion Control in Datacenters. Journal of Computer Science and Technology, 2021, 36(5): 1071-1086. https://doi.org/10.1007/s11390-021-1243-x

Received: 01 January 2021
Accepted: 25 August 2021
Published: 30 September 2021
© Institute of Computing Technology, Chinese Academy of Sciences 2021