On Symmetric Rectilinear Matrix Partitioning
arXiv - CS - Data Structures and Algorithms Pub Date : 2020-09-16 , DOI: arxiv-2009.07735
Abdurrahman Yaşar, Muhammed Fatih Balin, Xiaojing An, Kaan Sancak, and Ümit V. Çatalyürek

Distributing an irregular workload evenly across processing units is crucial for efficient parallelization in many applications. In this work, we are concerned with a spatial partitioning of sparse matrices called rectilinear partitioning (also known as generalized block distribution). More specifically, we address the problem of symmetric rectilinear partitioning of a square matrix. By symmetric, we mean that the rows and columns of the matrix are partitioned identically, yielding a tiling in which the diagonal tiles (blocks) are squares. We first show that finding an optimal solution to this problem is NP-hard, and we propose four heuristics to solve two different variants of this problem. We present a thorough analysis of the computational complexities of the proposed heuristics. To make the proposed techniques more applicable in real-life application scenarios, we further reduce their computational complexities by utilizing effective sparsification strategies together with an efficient sparse prefix-sum data structure. We experimentally show that the proposed algorithms are efficient and effective on more than six hundred test matrices. With sparsification, our methods take less than 3 seconds on the Twitter graph on a modern 24-core system and output a solution whose load imbalance is no worse than 1%.
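
To make the definition of a symmetric rectilinear partition concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the matrix is given as a list of (row, column) nonzero coordinates and that a single cut vector is applied to both rows and columns. The names `tile_loads` and `load_imbalance` are hypothetical, and the imbalance measure used here (maximum tile load divided by average tile load) is an assumed definition, since the abstract does not state one.

```python
from bisect import bisect_right


def tile_loads(nonzeros, cuts):
    """Count nonzeros per tile when rows and columns share one cut vector.

    nonzeros: iterable of (row, col) indices of a square sparse matrix.
    cuts: sorted list [0, c1, ..., n]; part k covers indices
          cuts[k] .. cuts[k + 1] - 1 for both rows and columns.
    """
    p = len(cuts) - 1
    loads = [[0] * p for _ in range(p)]
    for r, c in nonzeros:
        i = bisect_right(cuts, r) - 1  # row part containing r
        j = bisect_right(cuts, c) - 1  # column part containing c
        loads[i][j] += 1
    return loads


def load_imbalance(loads):
    """Assumed measure: maximum tile load divided by average tile load."""
    flat = [x for row in loads for x in row]
    return max(flat) / (sum(flat) / len(flat))


# A 4x4 example matrix with 8 nonzeros, cut symmetrically at index 2.
nonzeros = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 3), (3, 0), (3, 3)]
loads = tile_loads(nonzeros, [0, 2, 4])
print(loads)                  # [[3, 1], [1, 3]]
print(load_imbalance(loads))  # 1.5
```

Because the same cut vector is used on both axes, every diagonal tile is square, which matches the symmetry condition described above. This sketch scans all nonzeros for one fixed cut vector; it does not search for good cuts and does not use the sparse prefix-sum structure or sparsification mentioned in the abstract.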

Updated: 2020-09-17