Thermal neutrons: a possible threat for supercomputer reliability
The Journal of Supercomputing (IF 3.3), Pub Date: 2020-05-23, DOI: 10.1007/s11227-020-03324-9
Daniel Oliveira, Sean Blanchard, Nathan DeBardeleben, Fernando Fernandes dos Santos, Gabriel Piscoya Dávila, Philippe Navaux, Andrea Favalli, Opale Schappert, Stephen Wender, Carlo Cazzaniga, Christopher Frost, Paolo Rech

The high performance, high efficiency, and low cost of Commercial Off-The-Shelf (COTS) devices make them attractive for applications with strict reliability constraints. Today, COTS devices are adopted in HPC and in safety-critical applications such as autonomous driving. Unfortunately, the cheap natural boron widely used in the COTS chip manufacturing process makes these devices highly susceptible to thermal (low-energy) neutrons. In this paper, we demonstrate that thermal neutrons are a significant threat to COTS device reliability. For our study, we consider two DDR memories, an AMD APU, three NVIDIA GPUs, an Intel accelerator, and an FPGA executing a relevant set of algorithms. We consider different scenarios that impact the thermal neutron flux, such as weather, concrete walls and floors, and HPC liquid cooling systems. Correlating beam experiments and neutron detector data, we show that the thermal neutron FIT (Failure In Time) rate could be comparable to, or even higher than, the high-energy neutron FIT rate.
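
The comparison in the abstract's last sentence rests on a simple relation: a FIT rate (failures per 10^9 device-hours) is conventionally estimated as the device cross section measured under a beam multiplied by the neutron flux at the deployment site. The following is a minimal Python sketch of that arithmetic, assuming this conventional relation. The function name `fit_rate` and every numeric input are hypothetical placeholders chosen for illustration; none of them are results from the paper.

```python
HOURS_PER_FIT = 1e9  # FIT is defined as failures per 10^9 device-hours


def fit_rate(cross_section_cm2, flux_n_per_cm2_h):
    """Estimate a FIT rate from a beam-measured cross section (cm^2)
    and an environmental neutron flux (neutrons / cm^2 / hour)."""
    if cross_section_cm2 < 0 or flux_n_per_cm2_h < 0:
        raise ValueError("cross section and flux must be non-negative")
    return cross_section_cm2 * flux_n_per_cm2_h * HOURS_PER_FIT


# Hypothetical placeholder inputs, NOT measurements from the paper.
sigma_high_energy = 2.0e-9  # cm^2, as if measured under a high-energy beam
sigma_thermal = 1.0e-9      # cm^2, as if measured under a thermal beam
flux_high_energy = 13.0     # n/cm^2/h, assumed site flux
flux_thermal = 30.0         # n/cm^2/h, assumed; depends on surroundings

fit_he = fit_rate(sigma_high_energy, flux_high_energy)
fit_th = fit_rate(sigma_thermal, flux_thermal)
print(f"high-energy FIT: {fit_he:.1f}, thermal FIT: {fit_th:.1f}")
```

With these placeholder inputs the thermal FIT estimate exceeds the high-energy one even though the assumed thermal cross section is smaller, because the assumed local thermal flux is larger. This is why the scenarios that change the thermal flux (weather, concrete, liquid cooling) matter to the final estimate.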

Updated: 2020-05-23