An In-Depth Analysis of the Slingshot Interconnect
arXiv - CS - Performance, Pub Date: 2020-08-20, DOI: arxiv-2008.08886
Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, Torsten Hoefler

The interconnect is one of the most critical components in large-scale computing systems, and its impact on application performance will grow with system size. In this paper, we describe Slingshot, an interconnection network for large-scale computing systems. Slingshot is based on high-radix switches, which allow building exascale and hyperscale datacenter networks with at most three switch-to-switch hops. Moreover, Slingshot provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes. Slingshot uses an optimized Ethernet protocol, which allows it to interoperate with standard Ethernet devices while providing high performance to HPC applications. We analyze the extent to which Slingshot provides these features, evaluating it on microbenchmarks and on several applications from the datacenter and AI worlds, as well as on HPC applications. We find that applications running on Slingshot are less affected by congestion than those running on previous-generation networks.

Updated: 2020-08-21