A Framework for Low-Communication 1-D FFT
Scientific Programming Pub Date : 2013 , DOI: 10.3233/spr-130373
Ping Tak Peter Tang, Jongsoo Park, Daehyun Kim, Vladimir Petrov

In high-performance computing on distributed-memory systems, communication often represents a significant part of the overall execution time. The relative cost of communication will certainly continue to rise as compute-density growth follows the current technology and industry trends. Design of lower-communication alternatives to fundamental computational algorithms has become an important field of research. For distributed 1-D FFT, communication cost has hitherto remained high as all industry-standard implementations perform three all-to-all internode data exchanges (also called global transposes). These communication steps indeed dominate execution time. In this paper, we present a mathematical framework from which many single-all-to-all and easy-to-implement 1-D FFT algorithms can be derived. For large-scale problems, our implementation can be twice as fast as leading FFT libraries on state-of-the-art computer clusters. Moreover, our framework allows a tradeoff between accuracy and performance, further boosting performance if reduced accuracy is acceptable.
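To make the "three global transposes" concrete, here is a minimal single-process sketch of one well-known three-transpose formulation, the classic six-step Cooley-Tukey factorization with N = n1 * n2. It is not the framework proposed in the paper, and the abstract does not say which formulation any particular library uses. The function name `six_step_fft` and the parameters `n1` and `n2` are illustrative assumptions. Each in-memory transpose below stands in for what would be an all-to-all exchange when the matrix rows are spread across nodes.

```python
import numpy as np

def six_step_fft(x, n1, n2):
    """Length n1*n2 DFT via the six-step factorization (illustrative sketch)."""
    n = n1 * n2
    assert x.shape == (n,)

    # View the input as an n1 x n2 matrix: a[j1, j2] = x[n2*j1 + j2].
    a = x.reshape(n1, n2)

    # Transpose 1: make the stride-n2 subsequences contiguous rows.
    at = a.T.copy()                      # shape (n2, n1)

    # n2 independent row FFTs of length n1.
    bt = np.fft.fft(at, axis=1)          # bt[j2, k1]

    # Twiddle factors exp(-2*pi*i * j2*k1 / n).
    j2 = np.arange(n2)[:, None]
    k1 = np.arange(n1)[None, :]
    bt *= np.exp(-2j * np.pi * j2 * k1 / n)

    # Transpose 2: bring index j2 back onto the row direction.
    c = bt.T.copy()                      # shape (n1, n2), c[k1, j2]

    # n1 independent row FFTs of length n2.
    d = np.fft.fft(c, axis=1)            # d[k1, k2]

    # Transpose 3: restore natural output order, X[k1 + n1*k2] = d[k1, k2].
    return d.T.reshape(n)
```

On one process the three transposes are cheap copies; in a distributed run, under the assumption that each node owns a block of rows, each of them would move nearly all of the data across the network, which is the cost the abstract describes as dominant.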

Updated: 2020-09-25