A Low-Latency Communication Design for Brain Simulations
IEEE NETWORK ( IF 6.8 ) Pub Date: 2022-05-30, DOI: 10.1109/mnet.008.2100447
Xin Du 1 , Yuhao Liu 1 , Zhihui Lu 1 , Qiang Duan 2 , Jianfeng Feng 1 , Jie Wu 1 , Boyu Chen 1 , Qibao Zheng 1

Brain simulation, as one of the latest advances in artificial intelligence, facilitates a better understanding of how information is represented and processed in the brain. The extreme complexity of the human brain makes brain simulation feasible only on high-performance computing platforms. Supercomputers with a large number of interconnected graphical processing units (GPUs) are currently employed to support brain simulations. Therefore, high-throughput, low-latency inter-GPU communication in supercomputers plays a crucial role in meeting the performance requirements of brain simulation, a highly time-sensitive application. In this article, we first provide an overview of current parallelization technologies for brain simulations on multi-GPU architectures. Then we analyze the communication challenges of brain simulation and summarize guidelines for communication design to address them. Furthermore, we propose a partitioning algorithm and a two-level routing method to achieve efficient low-latency communication in multi-GPU architectures for brain simulation. We report experimental results obtained on a supercomputer with 2,000 GPUs simulating a brain model with 10 billion neurons (digital twin brain, DTB), showing that our approach can significantly improve communication performance. We also discuss open issues and identify research directions for low-latency communication design for brain simulations.
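The abstract does not detail the proposed partitioning algorithm or the two-level routing method. As a hypothetical illustration of the general ideas only, the sketch below shows (a) a simple greedy graph partitioning that places densely connected neurons on the same GPU to reduce inter-GPU spike traffic, and (b) a two-level route in which inter-node messages are relayed through a per-node "leader" GPU. All function names, the greedy heuristic, and the leader-relay scheme are assumptions for illustration, not the paper's actual method.

```python
def greedy_partition(num_neurons, synapses, num_gpus):
    """Greedily assign each neuron to the GPU that already holds the most
    of its synaptic partners, subject to a per-GPU capacity cap.
    (Illustrative heuristic, not the paper's algorithm.)"""
    capacity = -(-num_neurons // num_gpus)  # ceil(num_neurons / num_gpus)
    neighbors = {i: [] for i in range(num_neurons)}
    for src, dst in synapses:
        neighbors[src].append(dst)
        neighbors[dst].append(src)
    assignment, load = {}, [0] * num_gpus
    for n in range(num_neurons):
        # Score each GPU by how many of n's partners are already placed there.
        scores = [0] * num_gpus
        for p in neighbors[n]:
            if p in assignment:
                scores[assignment[p]] += 1
        # Pick the best-scoring GPU that still has room (break ties by load).
        best = max((g for g in range(num_gpus) if load[g] < capacity),
                   key=lambda g: (scores[g], -load[g]))
        assignment[n] = best
        load[best] += 1
    return assignment

def cut_edges(assignment, synapses):
    """Count synapses whose endpoints sit on different GPUs; each such
    edge implies an inter-GPU spike message every time the source fires."""
    return sum(1 for s, d in synapses if assignment[s] != assignment[d])

def two_level_route(src_gpu, dst_gpu, gpus_per_node):
    """Hypothetical two-level route: GPUs on the same node talk directly;
    traffic between nodes is relayed through each node's first GPU acting
    as a leader, so only leaders carry inter-node traffic."""
    src_node, dst_node = src_gpu // gpus_per_node, dst_gpu // gpus_per_node
    if src_node == dst_node:
        return [src_gpu, dst_gpu]  # direct intra-node hop
    src_leader = src_node * gpus_per_node
    dst_leader = dst_node * gpus_per_node
    hops = [src_gpu]
    if src_gpu != src_leader:
        hops.append(src_leader)
    hops.append(dst_leader)
    if dst_gpu != dst_leader:
        hops.append(dst_gpu)
    return hops
```

On a toy network of two fully connected 4-neuron clusters joined by one bridge synapse, the greedy placement keeps each cluster on one of two GPUs and cuts only the bridge edge, whereas a naive round-robin placement cuts most edges; this is the kind of traffic reduction a partitioning step aims for before routing is even considered.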

Updated: 2024-08-22