Efficient AES implementation on Sunway TaihuLight supercomputer: A systematic approach
Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2020-01-02, DOI: 10.1016/j.jpdc.2019.12.013
Liandeng Li, Jiarui Fang, Jinlei Jiang, Lin Gan, Weijie Zheng, Haohuan Fu, Guangwen Yang

Encryption is an important technique for improving information security in many real-world applications. The Advanced Encryption Standard (AES) is a widely used, efficient cryptographic algorithm. Although AES is fast in both software and hardware, encrypting large amounts of data remains time-consuming, so accelerating AES operations is an ongoing effort. This paper presents SW-AES, a parallel AES implementation on the Sunway TaihuLight, one of the fastest supercomputers in the world, which uses the SW26010 processor as its basic building block. Guided by the architectural features of the SW26010, SW-AES exploits parallelism at three levels: (1) inter-CPE (Computing Processing Element) data parallelism, which distributes tasks among the 256 on-chip CPEs; (2) intra-CPE data parallelism, enabled by the Single-Instruction Multiple-Data (SIMD) instructions inside each CPE; and (3) instruction-level parallelism, which pipelines memory access and computation. In addition, for each of its two application scenarios, SW-AES provides a scalable way to run AES efficiently across many nodes. As a result, SW-AES achieves a maximum throughput of 13.50 GB/s on a single SW26010 node, which is 216.23× higher than the latest parallel AES implementation on the Sunway TaihuLight and about 37.3% higher than the latest AES implementation on the GTX 480 GPU. When running on 1024 computing nodes, each processing 1 GB of data, SW-AES achieves a throughput of 13819.25 GB/s. In contrast, the latest related work on the Sunway TaihuLight achieves a throughput of only 63.91 GB/s.
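The throughput figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is not from the paper: the function and constant names are illustrative, and it assumes that the 63.91 GB/s figure for the prior work refers to the same 1024-node configuration, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract, in GB/s.
SINGLE_NODE_GBPS = 13.50    # SW-AES peak on one SW26010 node
MULTI_NODE_GBPS = 13819.25  # SW-AES on 1024 nodes, 1 GB of data per node
PRIOR_WORK_GBPS = 63.91     # latest related work (ASSUMED to be a 1024-node figure)
NODES = 1024


def scaling_efficiency(multi_gbps: float, single_gbps: float, nodes: int) -> float:
    """Fraction of ideal linear scaling: measured / (single-node peak * nodes)."""
    return multi_gbps / (single_gbps * nodes)


def speedup(new_gbps: float, old_gbps: float) -> float:
    """Ratio of the new throughput to the old one."""
    return new_gbps / old_gbps


per_node = MULTI_NODE_GBPS / NODES
efficiency = scaling_efficiency(MULTI_NODE_GBPS, SINGLE_NODE_GBPS, NODES)
ratio = speedup(MULTI_NODE_GBPS, PRIOR_WORK_GBPS)

print(f"per-node throughput at 1024 nodes: {per_node:.3f} GB/s")
print(f"efficiency vs. single-node peak:   {efficiency:.4f}")
print(f"multi-node ratio over prior work:  {ratio:.2f}x")
```

Under that assumption, the multi-node ratio comes out very close to the 216.23× quoted for a single node, and the per-node throughput at 1024 nodes stays within a fraction of a percent of the single-node peak, which is consistent with the near-linear scalability the abstract describes.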




Updated: 2020-01-04