Vertical, Temporal, and Horizontal Scaling of Hierarchical Hypersparse GraphBLAS Matrices
arXiv - CS - Performance. Pub Date: 2021-08-15, DOI: arxiv-2108.06650
Jeremy Kepner, Tim Davis, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Matthew Hubbell, Michael Houle, Michael Jones, Anna Klein, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Peter Michaleas

Hypersparse matrices are a powerful enabler for a variety of network, health, finance, and social applications. Hierarchical hypersparse GraphBLAS matrices enable rapid streaming updates while preserving algebraic analytic power and convenience. In many contexts, the rate of these updates sets the bounds on performance. This paper explores hierarchical hypersparse update performance on a variety of hardware with identical software configurations. The high-level language bindings of the GraphBLAS readily enable performance experiments on diverse hardware simultaneously. The best single-process performance measured was 4,000,000 updates per second. The best single-node performance measured was 170,000,000 updates per second. The hardware used spans nearly a decade, allowing a direct comparison of hardware improvements for this computation over that period: a 2x increase in single-core performance, a 3x increase in single-process performance, and a 5x increase in single-node performance. Running simultaneously on nearly 2,000 MIT SuperCloud nodes achieved a sustained rate of over 200,000,000,000 updates per second. Hierarchical hypersparse GraphBLAS allows the MIT SuperCloud to analyze extremely large streaming network data sets.
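The abstract does not spell out how a hierarchical matrix absorbs streaming updates. The sketch below is a minimal illustration in plain Python, assuming a cascading-cutoff scheme: new entries land in a small first layer, and whenever a layer holds more stored entries than its cutoff it is summed into the next layer and cleared. The class name `HierarchicalCounter`, its methods, and the cutoff values are hypothetical and chosen for illustration; dictionaries stand in for hypersparse matrices, and no GraphBLAS library is used.

```python
from collections import defaultdict


class HierarchicalCounter:
    """Toy stand-in for a hierarchical hypersparse matrix (illustrative only)."""

    def __init__(self, cutoffs):
        # cutoffs[i] is the largest number of stored entries layer i may hold
        # before it is merged into layer i + 1. The last layer is unbounded.
        self.cutoffs = list(cutoffs)
        self.layers = [defaultdict(int) for _ in range(len(self.cutoffs) + 1)]

    def update(self, row, col, value=1):
        # Every streaming update touches only the smallest layer.
        self.layers[0][(row, col)] += value
        self._cascade()

    def _cascade(self):
        # Push overfull layers upward, stopping at the first layer within its cutoff.
        for i, cutoff in enumerate(self.cutoffs):
            if len(self.layers[i]) <= cutoff:
                break
            upper = self.layers[i + 1]
            for key, val in self.layers[i].items():
                upper[key] += val
            self.layers[i].clear()

    def total(self):
        # The full matrix is the sum of all layers.
        result = defaultdict(int)
        for layer in self.layers:
            for key, val in layer.items():
                result[key] += val
        return dict(result)


if __name__ == "__main__":
    h = HierarchicalCounter(cutoffs=[2, 4])  # hypothetical cutoffs
    for edge in [(1, 2), (1, 2), (3, 4), (5, 6), (1, 2)]:
        h.update(*edge)
    print(h.total())
```

In this sketch most updates only touch the small first layer, which is the intuition behind keeping the update rate high; the sum over all layers is computed only when an analysis needs the full matrix.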

Updated: 2021-08-17