75,000,000,000 Streaming Inserts/Second Using Hierarchical Hypersparse GraphBLAS Matrices
arXiv - CS - Databases, Pub Date: 2020-01-20, DOI: arxiv-2001.06935
Jeremy Kepner, Tim Davis, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Matthew Hubbell, Michael Houle, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

The SuiteSparse GraphBLAS C library implements high-performance hypersparse matrices with bindings to a variety of languages (Python, Julia, and Matlab/Octave). GraphBLAS provides a lightweight in-memory database implementation of hypersparse matrices that is ideal for analyzing many types of network data, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of hypersparse matrices put enormous pressure on the memory hierarchy. This work benchmarks an implementation of hierarchical hypersparse matrices that reduces memory pressure and dramatically increases the update rate of a hypersparse matrix. The parameters of hierarchical hypersparse matrices control the number of entries allowed in each level of the hierarchy before an update is cascaded to the next level. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical hypersparse matrices achieve over 1,000,000 updates per second in a single instance. Scaling to 31,000 instances of hierarchical hypersparse matrices on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 75,000,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.
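
The cascading idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SuiteSparse GraphBLAS API and not the authors' implementation: the class name `HierarchicalSparse`, the `cuts` parameter, and the use of plain dictionaries in place of hypersparse matrices are all assumptions made for illustration. Each level holds a small sparse accumulator; when the number of stored entries in a level exceeds its cut, the whole level is added into the next one and cleared. Because addition is linear, the full matrix is assumed to be recoverable as the sum of all levels.

```python
from collections import defaultdict


class HierarchicalSparse:
    """Toy hierarchical sparse accumulator (illustrative only, not GraphBLAS)."""

    def __init__(self, cuts):
        # cuts[i] is the (hypothetical) maximum number of stored entries
        # allowed in level i before it cascades; the last level is unbounded.
        self.cuts = list(cuts)
        self.levels = [defaultdict(int) for _ in range(len(self.cuts) + 1)]

    def update(self, row, col, value=1):
        # Every streaming update lands in the smallest level first.
        self.levels[0][(row, col)] += value
        self._cascade()

    def _cascade(self):
        # Push an over-full level into the next one, then clear it.
        for i, cut in enumerate(self.cuts):
            if len(self.levels[i]) <= cut:
                break
            nxt = self.levels[i + 1]
            for key, val in self.levels[i].items():
                nxt[key] += val
            self.levels[i].clear()

    def total(self):
        # Sum of all levels gives the complete accumulated matrix.
        out = defaultdict(int)
        for level in self.levels:
            for key, val in level.items():
                out[key] += val
        return dict(out)


h = HierarchicalSparse(cuts=[2, 4])
for edge in [(0, 1), (0, 1), (2, 3), (4, 5), (0, 1), (6, 7), (8, 9)]:
    h.update(*edge)
print(h.total())
```

In this sketch, smaller cuts keep the first level small at the cost of more frequent cascades, which is the kind of trade-off the tunable parameters in the abstract refer to.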

Updated: 2020-08-04