Lupulus: A Flexible Hardware Accelerator for Neural Networks
arXiv - CS - Hardware Architecture. Pub Date: 2020-05-03, DOI: arxiv-2005.01016. Andreas Toftegaard Kristensen, Robert Giterman, Alexios Balatsoukas-Stimming, and Andreas Burg
Neural networks have become indispensable for a wide range of applications,
but they suffer from high computational and memory requirements, which call for
optimizations from the algorithmic description of the network to the hardware
implementation. Moreover, the high rate of innovation in machine learning makes
it important that hardware implementations provide a high level of
programmability to support current and future requirements of neural networks.
In this work, we present a flexible hardware accelerator for neural networks,
called Lupulus, supporting various methods for scheduling and mapping of
operations onto the accelerator. Lupulus was implemented in a 28 nm FD-SOI
technology and demonstrates a peak performance of 380 GOPS/GHz with latencies
of 21.4 ms and 183.6 ms for the convolutional layers of AlexNet and VGG-16,
respectively.
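
As a reading aid for the frequency-normalized figure above, the following minimal sketch converts GOPS/GHz into operations per clock cycle and into absolute GOPS at a given clock. Only the 380 GOPS/GHz value comes from the abstract. The convention of counting one multiply-accumulate (MAC) as two operations and the 1 GHz example clock are assumptions made for illustration; the abstract states neither.

```python
# Sketch only: interprets the normalized peak-performance figure.
# The names below are hypothetical helpers, not part of Lupulus.

PEAK_GOPS_PER_GHZ = 380.0  # from the abstract
OPS_PER_MAC = 2            # assumption: 1 MAC = 1 multiply + 1 add


def ops_per_cycle(gops_per_ghz: float) -> float:
    """GOPS/GHz = (1e9 ops/s) / (1e9 cycles/s), i.e. operations per cycle."""
    return gops_per_ghz


def peak_gops(gops_per_ghz: float, clock_ghz: float) -> float:
    """Absolute peak throughput in GOPS at the given clock frequency."""
    return gops_per_ghz * clock_ghz


def macs_per_cycle(gops_per_ghz: float, ops_per_mac: int = OPS_PER_MAC) -> float:
    """MACs per cycle under the assumed ops-per-MAC convention."""
    return ops_per_cycle(gops_per_ghz) / ops_per_mac


if __name__ == "__main__":
    example_clock_ghz = 1.0  # assumption: illustrative clock, not from the abstract
    print(ops_per_cycle(PEAK_GOPS_PER_GHZ))
    print(macs_per_cycle(PEAK_GOPS_PER_GHZ))
    print(peak_gops(PEAK_GOPS_PER_GHZ, example_clock_ghz))
```

Normalizing by frequency makes the figure independent of the clock an implementation reaches, so the absolute throughput scales linearly with whatever clock is actually achieved.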
Updated: 2020-05-05