BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning
arXiv - CS - Hardware Architecture. Pub Date: 2021-02-19, DOI: arXiv-2102.10140
D. Dang, S. V. R. Chittamuru, S. Pasricha, R. Mahapatra, D. Sahoo

Training deep learning networks involves continuous weight updates across the various layers of the deep network using the backpropagation (BP) algorithm. This results in expensive computation overhead during training. Consequently, most deep learning accelerators today employ pre-trained weights and focus only on improving the design of the inference phase. The recent trend is to build a complete deep learning accelerator by incorporating the training module. Such efforts require an ultra-fast chip architecture for executing the BP algorithm. In this article, we propose a novel photonics-based backpropagation accelerator for high-performance deep learning training. We present the design of a convolutional neural network, BPLight-CNN, which incorporates the silicon photonics-based backpropagation accelerator. BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN architecture for end-to-end training and prediction. We evaluate BPLight-CNN using a photonic CAD framework (IPKISS) on deep learning benchmark models including LeNet and VGG-Net. Compared to state-of-the-art designs, the proposed design achieves (i) at least 34x speedup, 34x improvement in computational efficiency, and 38.5x energy savings during training; and (ii) 29x speedup, 31x improvement in computational efficiency, and 38.7x energy savings during inference. All these comparisons are done at 16-bit resolution, and BPLight-CNN achieves these improvements at a cost of approximately 6% lower accuracy compared to the state-of-the-art.
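The layer-by-layer weight updates the abstract refers to can be illustrated with a minimal NumPy sketch of backpropagation on a toy two-layer network. This is a generic illustration of the BP algorithm only, not the paper's photonic or memristor-based design; the data, layer sizes, and learning rate are all hypothetical choices for demonstration.

```python
import numpy as np

# Minimal sketch of the backpropagation weight-update loop (generic BP,
# not BPLight-CNN's photonic implementation).
rng = np.random.default_rng(0)

# Hypothetical toy data: 4 samples, 3 features, binary targets.
X = rng.normal(size=(4, 3))
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# One hidden layer of 5 units; weights initialized randomly.
W1 = rng.normal(scale=0.5, size=(3, 5))
W2 = rng.normal(scale=0.5, size=(5, 1))
lr = 0.1  # learning rate (arbitrary for this sketch)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(200):
    # Forward pass through both layers.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    losses.append(float(np.mean((out - y) ** 2)))

    # Backward pass: propagate the error layer by layer (chain rule).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Continuous weight updates across the layers -- the costly step
    # that a BP accelerator targets.
    W2 -= lr * (h.T @ d_out)
    W1 -= lr * (X.T @ d_h)

print(losses[0], losses[-1])  # loss shrinks as training proceeds
```

Every training iteration repeats this forward/backward/update cycle over all layers, which is why executing BP quickly is the bottleneck an accelerator must address.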

Updated: 2021-02-23