Performance and energy consumption of HPC workloads on a cluster based on Arm ThunderX2 CPU
arXiv - CS - Performance. Pub Date: 2020-07-09, DOI: arxiv-2007.04868
Filippo Mantovani, Marta Garcia-Gasulla, José Gracia, Esteban Stafford, Fabio Banchelli, Marc Josep-Fabrego, Joel Criado-Ledesma, Mathias Nachtmann

In this paper, we analyze the performance and energy consumption of an Arm-based high-performance computing (HPC) system developed within the European project Mont-Blanc 3. This system, called Dibona, was integrated by ATOS/Bull and is powered by Marvell's latest CPU, the ThunderX2. The same CPU powers the Astra supercomputer, the first Arm-based supercomputer to enter the Top500, in November 2018. Our study ranges from micro-benchmarks to large production codes. We include an interdisciplinary evaluation of three scientific applications (a finite-element fluid dynamics code, a smoothed particle hydrodynamics code, and a lattice Boltzmann code) and the Graph 500 benchmark, focusing on parallel and energy efficiency and studying their scalability up to thousands of Armv8 cores. For comparison, we run the same tests on state-of-the-art x86 nodes included in Dibona and on the Tier-0 supercomputer MareNostrum4. Our experiments show that the ThunderX2 has 25% lower performance on average, mainly due to its small vector unit, although this is somewhat compensated by its 30% wider links between the CPU and the main memory. We found that the software ecosystem of the Armv8 architecture is comparable to the one available for Intel. Our results also show that the ThunderX2 delivers similar or better energy-to-solution and scalability, proving that Arm-based chips are legitimate contenders in the market of next-generation HPC systems.

Updated: 2020-07-13