Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation
arXiv - CS - Hardware Architecture. Pub Date: 2020-01-14, DOI: arxiv-2001.04974
Chuteng Zhou, Prad Kadambi, Matthew Mattina, Paul N. Whatmough

The success of deep learning has brought forth a wave of interest in computer hardware design to better meet the high demands of neural network inference. In particular, analog computing hardware, based on electronic, optical, or photonic devices, has been proposed specifically for accelerating neural networks and may well achieve lower power consumption than conventional digital electronics. However, these proposed analog accelerators suffer from the intrinsic noise generated by their physical components, which makes it challenging to achieve high accuracy on deep neural networks. Hence, for successful deployment on analog accelerators, it is essential to be able to train deep neural networks to be robust to random continuous noise in the network weights, a relatively new challenge in machine learning. In this paper, we advance the understanding of noisy neural networks. We outline how a noisy neural network has reduced learning capacity as a result of the loss of mutual information between its input and output. To combat this, we propose combining knowledge distillation with noise injection during training to obtain more noise-robust networks, which we demonstrate experimentally across different networks and datasets, including ImageNet. Our method yields models with up to twice the noise tolerance of the previous best attempts, a significant step towards making analog hardware practical for deep learning.
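The core recipe the abstract describes (train a student under injected weight noise while matching a clean teacher's soft outputs) can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's implementation: the `NoisyLinear` layer, the `noise_std` scaling relative to the largest weight magnitude, and the hyperparameters `T` and `alpha` are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Linear):
    """Linear layer whose weights are perturbed by Gaussian noise on every
    forward pass, simulating analog hardware error. The noise scale is set
    relative to the largest weight magnitude (an illustrative convention)."""
    def __init__(self, in_features, out_features, noise_std=0.1):
        super().__init__(in_features, out_features)
        self.noise_std = noise_std

    def forward(self, x):
        if self.noise_std > 0:
            sigma = self.noise_std * self.weight.abs().max()
            noisy_weight = self.weight + torch.randn_like(self.weight) * sigma
            return F.linear(x, noisy_weight, self.bias)
        return super().forward(x)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target distillation (KL divergence to the teacher's tempered
    softmax) plus a hard-label cross-entropy term. T and alpha are
    illustrative values, not taken from the paper."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Training step: the teacher runs noise-free; the student sees fresh weight
# noise on every forward pass, so it learns weights that tolerate perturbation.
def train_step(student, teacher, optimizer, x, y):
    with torch.no_grad():
        t_logits = teacher(x)   # clean soft targets
    s_logits = student(x)       # noise injected inside NoisyLinear layers
    loss = distillation_loss(s_logits, t_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because a fresh noise sample is drawn at each forward pass, the student is effectively trained over an ensemble of perturbed networks, while the teacher's soft targets carry class-similarity information that hard labels alone would not provide.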

Updated: 2020-01-15