A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic
arXiv - CS - Mathematical Software Pub Date : 2020-07-13 , DOI: arxiv-2007.06674
Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin Carson, Terry Cojean, Jack Dongarra, Mark Gates, Thomas Grützmacher, Nicholas J. Higham, Sherry Li, Neil Lindquist, Yang Liu, Jennifer Loe, Piotr Luszczek, Pratik Nayak, Sri Pranesh, Siva Rajamanickam, Tobias Ribizel, Barry Smith, Kasia Swirydowicz, Stephen Thomas, Stanimire Tomov, Yaohung M. Tsai, Ichitaro Yamazaki, Ulrike Meier Yang

In recent years, hardware vendors have started designing low-precision special function units in response to the machine learning community's demand for high compute power in low-precision formats. Server-line products increasingly feature such units as well; for example, the NVIDIA tensor cores in ORNL's Summit supercomputer provide more than an order of magnitude higher performance than what is available in IEEE double precision. At the same time, the gap between compute power and memory bandwidth keeps increasing, making data access and communication prohibitively expensive compared to arithmetic operations. To start the multiprecision focus effort, we survey the numerical linear algebra community and summarize all existing multiprecision knowledge, expertise, and software capabilities in this landscape analysis report. We also include current efforts and preliminary results that may not yet be considered "mature technology" but have the potential to grow into production quality within the multiprecision focus effort. As we expect the reader to be familiar with the basics of numerical linear algebra, we refrain from providing a detailed background on the algorithms themselves and instead focus on how mixed- and multiprecision technology can help improve the performance of these methods, and we present highlights of applications that significantly outperform traditional fixed-precision methods.


