Large-Scale Discrete Fourier Transform on TPUs
arXiv - CS - Mathematical Software, Pub Date: 2020-02-09, DOI: arxiv-2002.03260
Tianjian Lu; Yi-Fan Chen; Blake Hechtman; Tao Wang; John Anderson

In this work, we present a parallel algorithm for the large-scale discrete Fourier transform (DFT) on Tensor Processing Unit (TPU) clusters. The algorithm is implemented in TensorFlow because of its rich set of functionalities for scientific computing and the simplicity with which it realizes parallel computing algorithms. The DFT is formulated as matrix multiplications between the input data and the Vandermonde matrix. This formulation takes full advantage of the TPU's strength in matrix multiplication and accommodates nonuniformly sampled input data without modifying the implementation. For parallel computing, both the input data and the Vandermonde matrix are partitioned and distributed across TPU cores. With this data decomposition, the matrix multiplications stay local to each TPU core and can be performed completely in parallel. Communication among TPU cores uses the one-shuffle scheme, in which data is sent and received simultaneously between two neighboring cores, in the same direction along the interconnect network. The one-shuffle scheme is designed for the interconnect topology of TPU clusters and requires minimal communication time among TPU cores. Numerical examples demonstrate the high parallel efficiency of the large-scale DFT on TPUs.
Updated: 2020-02-11
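
The matrix formulation and the data decomposition described in the abstract can be illustrated with a small sketch. It is not the authors' TensorFlow implementation: it uses only the Python standard library, simulates the cores with plain lists, and reads the one-shuffle scheme as a ring rotation in which every core passes its input block to the next neighbor in the same direction. That reading is an assumption, since the abstract does not spell out the details. The names `vandermonde`, `matvec`, `dft`, and `ring_dft` are hypothetical.

```python
import cmath


def vandermonde(freqs, points):
    """Build V[k][n] = exp(-2*pi*i * freqs[k] * points[n]).

    The sample points are arbitrary, so nonuniform sampling needs no
    change to the code, only different values in `points`.
    """
    return [[cmath.exp(-2j * cmath.pi * f * x) for x in points] for f in freqs]


def matvec(matrix, vec):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def dft(values, points=None):
    """DFT as a single product with the Vandermonde matrix.

    With points=None the samples are uniform (n / N), which gives the
    standard DFT.
    """
    n = len(values)
    if points is None:
        points = [j / n for j in range(n)]
    return matvec(vandermonde(range(n), points), values)


def ring_dft(values, num_cores):
    """Simulated partitioned DFT (assumed ring interpretation).

    Core c owns output block c and starts with input block c. At each
    step it multiplies the matching Vandermonde sub-block by the input
    block it currently holds, accumulates locally, then passes that
    input block to its next neighbor.
    """
    n = len(values)
    assert n % num_cores == 0, "length must divide evenly across cores"
    b = n // num_cores
    points = [j / n for j in range(n)]
    blocks = [values[c * b:(c + 1) * b] for c in range(num_cores)]
    held = list(range(num_cores))  # index of the input block on each core
    out = [[0j] * b for _ in range(num_cores)]

    for _ in range(num_cores):
        # Local work: no core reads another core's data in this loop.
        for c in range(num_cores):
            src = held[c]
            sub = vandermonde(range(c * b, (c + 1) * b),
                              points[src * b:(src + 1) * b])
            partial = matvec(sub, blocks[c])
            out[c] = [o + p for o, p in zip(out[c], partial)]
        # Shuffle: every core sends to its next neighbor, same direction.
        # The final rotation is redundant and just returns blocks home.
        blocks = [blocks[(c - 1) % num_cores] for c in range(num_cores)]
        held = [held[(c - 1) % num_cores] for c in range(num_cores)]

    return [x for block in out for x in block]
```

Calling `ring_dft(x, 2)` on a length-4 input should agree with `dft(x)` up to rounding. Passing an explicit `points` list to `dft` evaluates the same product at nonuniform sample locations.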

 
