Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules
Journal of Cheminformatics (IF 7.1), Pub Date: 2022-10-17, DOI: 10.1186/s13321-022-00652-1
Jong Youl Choi, Pei Zhang, Kshitij Mehta, Andrew Blanchard, Massimiliano Lupo Pasini
Graph convolutional neural networks (GCNNs) are a popular class of deep learning (DL) models in materials science for predicting material properties from graph representations of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to effectively reducing the computational cost of GCNN training. However, efficient utilization of high performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, which leverages distributed data parallelism in PyTorch. We use ADIOS, a high-performance data management framework, for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). We present experimental results with HydraGNN showing (i) a reduction in data-loading time of up to 4.2x compared with a conventional method and (ii) linear scaling of training performance on up to 1024 GPUs on both Summit and Perlmutter.
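For context on the predicted target: the HOMO-LUMO gap is simply the energy difference between the lowest unoccupied molecular orbital (LUMO) and the highest occupied molecular orbital (HOMO). A minimal sketch of that definition for a closed-shell molecule is below; the orbital energies used are illustrative placeholders, not values from the paper's datasets.

```python
def homo_lumo_gap(orbital_energies_ev, n_electrons):
    """Return the HOMO-LUMO gap (in eV) for a closed-shell molecule.

    orbital_energies_ev: spatial-orbital energies in eV, in any order.
    n_electrons: total electron count; each spatial orbital holds 2 electrons.
    """
    levels = sorted(orbital_energies_ev)
    n_occ = n_electrons // 2        # number of doubly occupied orbitals
    homo = levels[n_occ - 1]        # highest occupied molecular orbital
    lumo = levels[n_occ]            # lowest unoccupied molecular orbital
    return lumo - homo

# Illustrative energies (eV) for a 6-electron system, chosen arbitrarily:
gap = homo_lumo_gap([-15.4, -13.2, -10.1, -0.5, 1.2], n_electrons=6)
print(round(gap, 2))  # -> 9.6
```

A GCNN surrogate such as the one trained here learns to map a molecular graph directly to this scalar, bypassing the quantum-chemistry calculation that would otherwise produce the orbital energies.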

Updated: 2022-10-17