

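1. School of Computer Science, Central South University, Changsha, Hunan 410083, China
2. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China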
Received: 5 May 2022
Revised: 6 July 2022
Published: 25 September 2023
HU Jin-bin, HUANG Jia-wei, WANG Jian-xin, et al. A Transmission Control Mechanism for Lossless Datacenter Network Based on Direct Congestion Notification[J]. Acta Electronica Sinica, 2023, 51(09): 2355-2365. DOI: 10.12263/DZXB.20220491.
Priority-based flow control (PFC) is widely deployed in data center networks to avoid packet loss caused by buffer overflow. Although PFC guarantees lossless transmission, it introduces side effects such as head-of-line blocking and congestion spreading. In recent years, several transport protocols with end-to-end congestion awareness have been proposed; they effectively alleviate network congestion and reduce how often PFC is triggered. However, under transient congestion caused by bursty traffic, PFC is still triggered frequently even when these end-to-end transport protocols are deployed, resulting in severe head-of-line blocking and congestion spreading. To address this problem, this paper proposes direct congestion notification (DCON), a solution implemented on switches that builds on end-to-end congestion control. Under bursty congestion, DCON promptly identifies the congested flows (those actually responsible for the congestion) that share an ingress port with non-congested flows (those not responsible for the congestion). The switch then sends a congestion notification message directly to the corresponding senders and accurately sets the target rate of the identified congested flows at the sender. Experimental results show that, compared with existing end-to-end congestion control transport protocols, DCON effectively avoids PFC-induced head-of-line blocking and congestion spreading and reduces the average flow completion time by up to 55%.
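The distinction the abstract draws between congested and non-congested flows on a shared ingress port can be illustrated with a toy model. The abstract does not give DCON's detection rule or rate formula, so everything below is an assumption made for illustration: the `Flow` record, the queue-depth threshold test, the fair-share target rate, and all function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical per-flow record kept by the toy switch."""
    flow_id: str
    ingress_port: int
    egress_port: int
    rate_gbps: float


def identify_congested_flows(flows, egress_queue_kb, threshold_kb):
    """Return the flows heading to an egress port whose queue exceeds the threshold.

    Assumption: congestion is judged by egress queue depth alone.
    """
    congested_ports = {
        port for port, depth in egress_queue_kb.items() if depth > threshold_kb
    }
    return [f for f in flows if f.egress_port in congested_ports]


def build_notifications(congested, link_capacity_gbps):
    """Map each congested flow to a target rate the switch would send to its sender.

    Assumption: the target is an equal share of the egress link, never
    higher than the flow's current rate.
    """
    by_port = {}
    for f in congested:
        by_port.setdefault(f.egress_port, []).append(f)
    notifications = {}
    for group in by_port.values():
        fair_share = link_capacity_gbps / len(group)
        for f in group:
            notifications[f.flow_id] = min(f.rate_gbps, fair_share)
    return notifications


# Flows A and C share ingress port 1; only A heads to the congested egress port 10.
flows = [
    Flow("A", ingress_port=1, egress_port=10, rate_gbps=40.0),
    Flow("B", ingress_port=2, egress_port=10, rate_gbps=40.0),
    Flow("C", ingress_port=1, egress_port=11, rate_gbps=10.0),
]
egress_queue_kb = {10: 300.0, 11: 5.0}
congested = identify_congested_flows(flows, egress_queue_kb, threshold_kb=100.0)
notifications = build_notifications(congested, link_capacity_gbps=40.0)
print(notifications)  # {'A': 20.0, 'B': 20.0}
```

In this sketch flow C keeps its rate even though it enters through the same port as flow A. A PFC PAUSE issued on ingress port 1 would instead stall both A and C, which is the head-of-line blocking the abstract describes.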