Leveraging GPU batching for scalable nonlinear programming through massive Lagrangian decomposition, arXiv - CS - Mathematical Software


arXiv - CS - Mathematical Software. Pub Date: 2021-06-28, DOI: arxiv-2106.14995
Youngdae Kim, François Pacaud, Kibaek Kim, Mihai Anitescu

We present the implementation of ExaTron, a trust-region Newton algorithm for bound-constrained nonlinear programming problems that runs entirely on multiple GPUs. By avoiding data transfers between CPU and GPU, our implementation eliminates a major performance bottleneck in memory-bound situations, particularly when solving many small problems in batch. We discuss the design principles and implementation details of our kernel function and core operations, and justify the different design choices with numerical experiments. Using distributed control of alternating current optimal power flow as the application, in which a large problem is decomposed into many smaller nonlinear programs by a Lagrangian approach, we demonstrate the computational performance of ExaTron on the Summit supercomputer at Oak Ridge National Laboratory. Our numerical results show linear scaling with respect to the batch size and the number of GPUs, and a speedup of more than 35 times on 6 GPUs over the 40 CPUs available on a single node.
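
To make the "many small problems in batch" idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not ExaTron's code: the function names (`project`, `solve_box_qp`, `solve_batch`) are hypothetical, the subproblems are small box-constrained quadratics rather than the power flow subproblems from the abstract, and the solver is simple projected gradient descent standing in for the trust-region Newton method. The batch loop runs sequentially here; in a GPU setting each independent subproblem would presumably be assigned to its own group of threads and solved concurrently.

```python
# Hypothetical sketch: names and the projected-gradient solver are assumptions,
# not ExaTron's trust-region Newton method or its API.

def project(x, lower, upper):
    """Clip each coordinate of x into [lower, upper]."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

def solve_box_qp(A, b, lower, upper, iters=500):
    """Minimize 0.5*x'Ax - b'x subject to lower <= x <= upper
    by projected gradient descent (A symmetric positive definite)."""
    n = len(b)
    # Step size 1/L, where L bounds the largest eigenvalue of A
    # (the maximum absolute row sum is a valid upper bound).
    L = max(sum(abs(a) for a in row) for row in A)
    x = project([0.0] * n, lower, upper)
    for _ in range(iters):
        grad = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        x = project([xi - g / L for xi, g in zip(x, grad)], lower, upper)
    return x

def solve_batch(problems):
    """Solve independent subproblems one by one. On a GPU the subproblems
    would run concurrently instead of in this sequential loop."""
    return [solve_box_qp(*p) for p in problems]

problems = [
    # Unconstrained minimizer (1, 2) lies inside the box.
    ([[2.0, 0.0], [0.0, 1.0]], [2.0, 2.0], [0.0, 0.0], [5.0, 5.0]),
    # Unconstrained minimizer (3, -1) lies outside the box,
    # so the solution sits on the bounds at (1, 0).
    ([[1.0, 0.0], [0.0, 1.0]], [3.0, -1.0], [0.0, 0.0], [1.0, 1.0]),
]
solutions = solve_batch(problems)
```

Because the subproblems share no data, the work per subproblem is fixed and the total cost grows with the number of subproblems, which is the structure that batching exploits.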


Updated: 2021-06-30
