A Holistic Algorithmic Approach to Improving Accuracy, Robustness, and Computational Efficiency for Atmospheric Dynamics
SIAM Journal on Scientific Computing (IF 3.0), Pub Date: 2020-10-27, DOI: 10.1137/19m128435x
Matthew Norman, Jeffrey Larkin

SIAM Journal on Scientific Computing, Volume 42, Issue 5, Page B1302-B1327, January 2020.
Atmospheric weather and climate models must perform simulations very quickly to be useful. Therefore, modelers have traditionally focused on reducing computations as much as possible. However, in our new era of increasingly compute-capable hardware, data movement is now the prohibiting expense. This study examines the computational benefits of a new algorithmic approach to modeling atmospheric dynamics on scales relevant to weather and climate simulation. Rather than minimizing computations, this new approach considers the larger problem more holistically, including spatial accuracy, temporal accuracy, robustness (i.e., oscillations), on-node efficiency, and internode data transfers together at once. Numerical experiments demonstrate how computations can be strategically increased to simultaneously address each of these constraints while reducing data movement to adapt to modern accelerated hardware. The new algorithm can achieve at times up to 80% peak floating point throughput in single precision on the Nvidia Tesla V100 GPU, where the traditional approach is shown to only achieve single-digit floating point efficiency. Further, the new algorithm is twice as fast as a standard Runge-Kutta time integrator, and high-order accuracy with Weighted Essentially Non-Oscillatory (WENO) limiting came at less than 30% additional runtime cost on a GPU, thus increasing the accuracy per degree of freedom.
Updated: 2020-12-04