HOBFLOPS CNNs: Hardware Optimized Bitsliced Floating-Point Operations Convolutional Neural Networks,arXiv - CS - Hardware Architecture

HOBFLOPS CNNs: Hardware Optimized Bitsliced Floating-Point Operations Convolutional Neural Networks
arXiv - CS - Hardware Architecture, Pub Date: 2020-07-11, DOI: arxiv-2007.06563
James Garland, David Gregg

Convolutional neural network (CNN) inference is commonly performed with 8-bit integer values. However, higher-precision floating-point (FP) inference is sometimes required. Existing processors support 16- or 32-bit FP but do not typically support custom-precision FP. We propose hardware optimized bitsliced floating-point operators (HOBFLOPS), a method of generating efficient custom-precision emulated bitsliced software FP arithmetic for CNNs. We compare HOBFLOPS8 to HOBFLOPS16 performance against SoftFP16 on Arm Neon and Intel architectures. HOBFLOPS allows researchers to prototype arbitrary levels of FP arithmetic precision for CNN accelerators. Furthermore, fast custom-precision HOBFLOPS FP CNNs in software may be valuable in cases where memory bandwidth is limited.
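
As a rough illustration of what "bitsliced" arithmetic means, the sketch below transposes a batch of small integers into bit planes and adds every lane at once using only bitwise gates. It is an assumption-laden toy, not the HOBFLOPS method: it handles unsigned integer addition rather than floating-point operators, and the names `WIDTH`, `to_bitslices`, `from_bitslices`, `bitsliced_add`, and `add_many` are hypothetical, not taken from the paper.

```python
# Illustrative sketch only: bitsliced *integer* addition, not the HOBFLOPS
# floating-point operators. All names here are hypothetical.

WIDTH = 8  # bits per value (an assumed width chosen for the example)


def to_bitslices(values, width=WIDTH):
    """Transpose values into bit planes: bit `lane` of planes[i] is bit i of values[lane]."""
    planes = [0] * width
    for lane, v in enumerate(values):
        for i in range(width):
            planes[i] |= ((v >> i) & 1) << lane
    return planes


def from_bitslices(planes, lanes):
    """Inverse transpose: rebuild one integer per lane from the bit planes."""
    return [
        sum(((planes[i] >> lane) & 1) << i for i in range(len(planes)))
        for lane in range(lanes)
    ]


def bitsliced_add(a_planes, b_planes):
    """Ripple-carry adder made of bitwise gates; all lanes are added in parallel."""
    carry = 0
    out = []
    for a, b in zip(a_planes, b_planes):
        a_xor_b = a ^ b
        out.append(a_xor_b ^ carry)            # sum bit for every lane
        carry = (a & b) | (a_xor_b & carry)    # carry-out for every lane
    return out  # final carry is dropped, so results wrap modulo 2**width


def add_many(xs, ys, width=WIDTH):
    """Add two equal-length lists of unsigned integers lane by lane."""
    planes = bitsliced_add(to_bitslices(xs, width), to_bitslices(ys, width))
    return from_bitslices(planes, len(xs))
```

Calling `add_many([1, 2, 200, 255], [3, 4, 100, 1])` adds four lanes with one pass over eight bit planes. In a SIMD setting each plane would occupy a vector register, so the number of lanes processed per gate would equal the register width in bits.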

Updated: 2020-07-15
