Yet Another Tensor Toolbox for Discontinuous Galerkin Methods and Other Applications
ACM Transactions on Mathematical Software (IF 2.7), Pub Date: 2020-10-16, DOI: 10.1145/3406835
Carsten Uphoff, Michael Bader

The numerical solution of partial differential equations is at the heart of many grand challenges in supercomputing. Solvers based on high-order discontinuous Galerkin (DG) discretisation have been shown to scale on large supercomputers with excellent performance and efficiency, provided the implementation exploits all levels of parallelism and is tailored to the specific architecture. However, new supercomputers emerge every year, and the list of hardware-specific considerations grows alongside the list of desired features in a DG code. Thus, we believe that a sustainable DG code needs an abstraction layer to implement the numerical scheme in a suitable language. We explore the possibility of abstracting the numerical scheme as small tensor operations, describing them in a domain-specific language (DSL) resembling the Einstein notation, and mapping them to small General Matrix-Matrix Multiplication (GEMM) routines. The compiler for our DSL implements classic optimisations that are used for large tensor contractions, and we present novel optimisation techniques such as equivalent sparsity patterns and optimal index permutations for temporary tensors. Our application examples, which include the earthquake simulation software SeisSol, show that the generated kernels achieve over 50% of peak performance on a recent 48-core Skylake system, while the DSL considerably simplifies the implementation.
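
To make the idea of mapping small tensor operations to matrix multiplication concrete, the following is a minimal sketch in plain Python. It is not the toolbox's DSL and not its generated code; the function names, the chosen contraction C[i,j,k] = A[i,l] B[l,j,k] (summed over l), and the sample data are all assumptions made for illustration. The sketch fuses the free indices (j, k) into a single column index, so that the whole contraction becomes one matrix product, and checks the result against a direct evaluation of the index sum.

```python
# Illustrative sketch only: hypothetical helper names, not the toolbox's API.
# Contraction in Einstein notation: C[i,j,k] = A[i,l] * B[l,j,k], summed over l.

def gemm(a, b):
    """Plain matrix product of nested lists: (m x p) times (p x n)."""
    m, p, n = len(a), len(b), len(b[0])
    return [[sum(a[r][l] * b[l][c] for l in range(p)) for c in range(n)]
            for r in range(m)]

def contract_direct(a, b):
    """Reference implementation: evaluate the index sum term by term."""
    n_i, n_l, n_j, n_k = len(a), len(b), len(b[0]), len(b[0][0])
    return [[[sum(a[i][l] * b[l][j][k] for l in range(n_l))
              for k in range(n_k)]
             for j in range(n_j)]
            for i in range(n_i)]

def contract_via_gemm(a, b):
    """Fuse indices (j, k) into one column index, then call a single GEMM."""
    n_j, n_k = len(b[0]), len(b[0][0])
    # View B as a matrix with n_l rows and n_j * n_k columns.
    b_mat = [[b_l[j][k] for j in range(n_j) for k in range(n_k)] for b_l in b]
    c_mat = gemm(a, b_mat)  # n_i rows, n_j * n_k columns
    # Split the fused column index back into (j, k).
    return [[[row[j * n_k + k] for k in range(n_k)] for j in range(n_j)]
            for row in c_mat]

A = [[1, 2], [3, 4], [5, 6]]                            # shape (3, 2)
B = [[[1, 0, 2], [0, 1, 3]], [[2, 1, 0], [1, 0, 1]]]    # shape (2, 2, 3)

C = contract_via_gemm(A, B)
assert C == contract_direct(A, B)
```

In this sketch the fused view of B is built by copying. With a contiguous row-major layout the same view would need no data movement, which is one reason the choice of index order for intermediate tensors can matter.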

Updated: 2020-10-16