FIXAR: A Fixed-Point Deep Reinforcement Learning Platform with Quantization-Aware Training and Adaptive Parallelism
arXiv - CS - Hardware Architecture. Pub Date: 2021-02-24, DOI: arXiv:2102.12103
Je Yang, Seongmin Hong, Joo-Young Kim

In this paper, we present FIXAR, a deep reinforcement learning platform that, for the first time, employs fixed-point data types and arithmetic units through a SW/HW co-design approach. Starting from 32-bit fixed-point data, Quantization-Aware Training (QAT) reduces the data precision based on the range of activations and performs retraining to minimize the reward degradation. FIXAR proposes an adaptive array processing core composed of configurable processing elements that supports both intra-layer parallelism and intra-batch parallelism for high-throughput inference and training. Implemented on a Xilinx U50 FPGA, FIXAR achieves 25293.3 inferences per second (IPS) of training throughput and 2638.0 IPS/W of accelerator efficiency, which is 2.7 times faster and 15.4 times more energy efficient than a CPU-GPU platform, without any accuracy degradation.
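The QAT step described above can be pictured as "fake quantization" during retraining: activations are snapped to a fixed-point grid in the forward pass while values stay in floating point so gradients can flow (straight-through estimation). The Python/NumPy sketch below is a minimal illustration under that assumption only; the function names, the 16-bit word size, and the range-based bit-allocation heuristic are hypothetical and do not reflect FIXAR's actual implementation.

```python
import numpy as np

def choose_frac_bits(x, total_bits=16):
    """Pick the number of fractional bits so the observed activation
    range fits into the integer part of a signed fixed-point word."""
    max_abs = float(np.max(np.abs(x))) + 1e-12
    int_bits = max(0, int(np.ceil(np.log2(max_abs))))
    return max(0, total_bits - 1 - int_bits)  # reserve 1 bit for sign

def fake_quantize(x, total_bits=16):
    """Quantize-dequantize: round activations to the fixed-point grid
    but return floats, so training can backprop through this op."""
    frac_bits = choose_frac_bits(x, total_bits)
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)  # clip handles boundary saturation
    return q / scale

# Toy usage: hidden-layer activations are snapped to the fixed-point
# grid during the forward pass of retraining.
acts = np.random.randn(4, 8).astype(np.float32) * 3.0
print(fake_quantize(acts, total_bits=16))
```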

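The two parallelism modes of the adaptive array core can likewise be sketched in software. The NumPy simulation below is illustrative only, assuming a hypothetical array of 4 processing elements (PEs): intra-layer mode splits one layer's output neurons across PEs for a single inference, while intra-batch mode gives each PE a different sample of a training batch; it is not a model of FIXAR's real microarchitecture.

```python
import numpy as np

N_PES = 4  # hypothetical PE count

def intra_layer(x, w):
    """Intra-layer parallelism: one input sample; the layer's output
    neurons are partitioned across PEs and computed concurrently."""
    cols = np.array_split(np.arange(w.shape[1]), N_PES)
    parts = [x @ w[:, c] for c in cols]           # one column slice per PE
    return np.concatenate(parts, axis=-1)

def intra_batch(batch, w):
    """Intra-batch parallelism: each PE runs the full layer for a
    different chunk of the training batch."""
    chunks = np.array_split(batch, N_PES, axis=0)
    return np.vstack([c @ w for c in chunks])     # one batch chunk per PE

w = np.random.randn(8, 16)
x = np.random.randn(1, 8)   # single inference -> spread the layer across PEs
b = np.random.randn(4, 8)   # training batch -> spread the samples across PEs
assert np.allclose(intra_layer(x, w), x @ w)
assert np.allclose(intra_batch(b, w), b @ w)
```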
Last updated: 2021-02-25