Stochastic Rounding: Algorithms and Hardware Accelerator
arXiv - CS - Hardware Architecture | Pub Date: 2020-01-06 | DOI: arxiv-2001.01501 | Author: Mantas Mikaitis
Algorithms and a hardware accelerator for performing stochastic rounding (SR)
are presented. The main goal is to augment the ARM M4F-based multi-core
processor SpiNNaker2 with more flexible rounding functionality than is
available in the ARM processor itself. The motivation for adding such an
accelerator in hardware comes from our previous results, which showed that SR
improves the numerical accuracy of ODE solvers in fixed-point arithmetic
compared to the standard round-to-nearest or bit-truncation rounding modes.
Furthermore, performing SR purely in software can be expensive because it
requires a pseudorandom number generator (PRNG), multiple masking and
shifting instructions, and an addition operation. The accelerator also
includes saturation of the rounded values, since rounding is usually followed
by saturation, which is especially important in fixed-point arithmetic because
of its narrow dynamic range of representable values. The main intended use of
the accelerator is to round fixed-point multiplier outputs, which the ARM
processor returns unrounded in a wider fixed-point format than the arguments.
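
To make the software cost mentioned in the abstract concrete, the following is a minimal C sketch of stochastic rounding with saturation applied to a wide multiplier output. It is an illustration only and not the paper's algorithm or accelerator interface. The s16.15 format, the xorshift32 generator (a stand-in PRNG whose output is only approximately uniform), and the names FRAC_BITS, prng_next, sr_round_sat and mul_sr are all assumptions introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed format for this sketch: signed 32-bit, 15 fractional bits. */
#define FRAC_BITS 15

/* Stand-in PRNG (Marsaglia xorshift32); hypothetical choice, not from the paper. */
static uint32_t prng_state = 2463534242u;

static uint32_t prng_next(void)
{
    uint32_t x = prng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    prng_state = x;
    return x;
}

/*
 * Stochastically round `wide` by discarding its `drop` low bits, then
 * saturate to the int32_t range. `rnd` supplies the random bits.
 * Adding a random value in [0, 2^drop) before truncating rounds up with
 * probability equal to the discarded fraction, assuming `rnd` is uniform.
 * Assumes >> on a negative int64_t is an arithmetic shift, which is
 * implementation-defined in C but holds for GCC and Clang.
 */
static int32_t sr_round_sat(int64_t wide, unsigned drop, uint32_t rnd)
{
    assert(drop >= 1 && drop <= 31);

    int64_t mask = ((int64_t)1 << drop) - 1;      /* masking     */
    int64_t sum = wide + ((int64_t)rnd & mask);   /* addition    */
    int64_t rounded = sum >> drop;                /* shifting    */

    if (rounded > INT32_MAX) return INT32_MAX;    /* saturation  */
    if (rounded < INT32_MIN) return INT32_MIN;
    return (int32_t)rounded;
}

/* Multiply two s16.15 values; the 64-bit product has 30 fractional bits. */
static int32_t mul_sr(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return sr_round_sat(product, FRAC_BITS, prng_next());
}
```

Each call to mul_sr costs one PRNG step plus a mask, an add, a shift and two comparisons on top of the multiply, which is the kind of per-operation overhead the abstract cites as motivation for a hardware accelerator.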
Updated: 2020-07-01