Techniques for high-performance construction of Fock matrices.
The Journal of Chemical Physics (IF 3.1), Pub Date: 2020-01-14, DOI: 10.1063/1.5129452
Hua Huang 1, C. David Sherrill 2, Edmond Chow 1

This paper presents techniques for Fock matrix construction that are designed for high performance on shared and distributed memory parallel computers when using Gaussian basis sets. Four main techniques are considered. (1) To calculate electron repulsion integrals, we demonstrate batching together the calculation of multiple shell quartets of the same angular momentum class so that the calculation of large sets of primitive integrals can be efficiently vectorized. (2) For multithreaded summation of entries into the Fock matrix, we investigate using a combination of atomic operations and thread-local copies of the Fock matrix. (3) For distributed memory parallel computers, we present a globally accessible matrix class for accessing distributed Fock and density matrices. The new matrix class introduces a batched mode for remote memory access that can reduce the synchronization cost. (4) For density fitting, we exploit both symmetry (of the Coulomb and exchange matrices) and sparsity (of 3-index tensors) and give a performance comparison of density fitting and the conventional direct calculation approach. The techniques are implemented in an open-source software library called GTFock.
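As a rough illustration of technique (2), the sketch below shows only the thread-local-copy half of the idea: each thread sums its share of contributions into a private matrix, then merges that matrix into the shared one in a single synchronized step. It is not GTFock's implementation. The function name `accumulate_thread_local`, the `(row, col, value)` contribution format, and the use of a Python lock in place of hardware atomic operations are all assumptions made for the example.

```python
import threading


def accumulate_thread_local(n, contributions, num_threads=4):
    """Sum (row, col, value) contributions into an n x n matrix.

    Hypothetical helper: each thread fills a private copy of the matrix
    and merges it into the shared matrix once, under a lock.
    """
    shared = [[0.0] * n for _ in range(n)]
    lock = threading.Lock()  # stands in for atomic updates (assumption)

    def worker(chunk):
        # Private copy: updates here need no synchronization.
        local = [[0.0] * n for _ in range(n)]
        for i, j, value in chunk:
            local[i][j] += value
        # One synchronized reduction per thread instead of one per update.
        with lock:
            for i in range(n):
                for j in range(n):
                    shared[i][j] += local[i][j]

    # Round-robin split of the work across threads.
    chunks = [contributions[t::num_threads] for t in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared
```

The trade-off this pattern exposes is memory against contention: a full private copy per thread avoids synchronizing every update but costs one extra matrix per thread, which is presumably why the abstract describes combining thread-local copies with atomic operations instead of relying on either alone.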

Updated: 2020-01-14