Analyzing the impact of soft errors in VGG networks implemented on GPUs, Microelectronics Reliability


Microelectronics Reliability (IF 1.6), Pub Date: 2020-07-01, DOI: 10.1016/j.microrel.2020.113648
Jinghe Wei, Younis Ibrahim, Siyu Qian, Haibin Wang, Guozhu Liu, Qingkui Yu, Rong Qian, Junwei Shi

Abstract: Convolutional Neural Network (CNN) models are widely implemented on Graphics Processing Units (GPUs) because of their computational power, particularly for object classification and detection. In safety-critical environments, reliability issues induced in these models by soft errors are a major concern. In this paper, we analyze the reliability of the VGG model, a well-known CNN architecture, through two metrics: (1) the Kernel Vulnerability Factor (KVF), used to identify vulnerable kernels, and (2) the Layer Vulnerability Factor (LVF), used to track fault propagation through layers. As our model runs on an NVIDIA GPU, our main evaluation tool in this study is the fault injector SASSIFI. Our results show that Im2col presents the highest vulnerability among all kernels, and that hardening this kernel reduces the silent data corruption rate by 85.67% at the cost of a 35.63% time penalty. The average rate of errors causing misclassification may reach 19.7% in some injection modes, which may be unacceptably high. The fault injection mode affects both KVF and LVF because RF mode is performed at the architecture level, while IOV and IOA modes operate at a different level (i.e., the program level). In addition, a thorough comparison among VGG, AlexNet, and ResNet indicates that LVF and KVF are determined by the CNN architecture. This is because the layers of these networks, particularly the convolutional layers, are implemented with the Im2col kernel, which is both the most vulnerable and the most frequently invoked kernel in these networks.
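The abstract does not state how KVF or LVF are computed. As a rough illustration only, the sketch below assumes a vulnerability factor is the share of injected faults in a kernel that end in silent data corruption (SDC). The injection records, the outcome labels ("masked", "sdc", "due"), the kernel names other than Im2col, and the function name `vulnerability_factors` are all hypothetical and are not taken from the paper or from SASSIFI output.

```python
from collections import Counter, defaultdict

# Hypothetical injection log: (kernel name, outcome) pairs.
# Assumed outcome labels: "masked" (no visible effect),
# "sdc" (silent data corruption), "due" (crash or hang).
# These records are made-up placeholders, not results from the paper.
injections = [
    ("im2col", "sdc"),
    ("im2col", "masked"),
    ("im2col", "sdc"),
    ("gemm", "masked"),
    ("gemm", "sdc"),
    ("gemm", "masked"),
    ("gemm", "due"),
    ("maxpool", "masked"),
]


def vulnerability_factors(records):
    """Return {kernel: fraction of its injections that ended in SDC}.

    Assumed definition for illustration; the paper may define KVF differently.
    """
    outcomes = defaultdict(Counter)
    for kernel, outcome in records:
        outcomes[kernel][outcome] += 1
    return {
        kernel: counts["sdc"] / sum(counts.values())
        for kernel, counts in outcomes.items()
    }


kvf = vulnerability_factors(injections)

# Rank kernels from most to least vulnerable under the assumed definition.
for kernel, factor in sorted(kvf.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{kernel}: {factor:.2f}")
```

Under the same assumption, grouping the records by layer instead of by kernel would give a per-layer figure in the spirit of LVF, and ranking kernels this way is one plausible route to picking a hardening target such as Im2col.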


Updated: 2020-07-01
