RNN-Based Radio Resource Management on Multicore RISC-V Accelerator Architectures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (IF 2.8), Pub Date: 2021-07-12, DOI: 10.1109/tvlsi.2021.3093242
Gianna Paulin, Renzo Andri, Francesco Conti, Luca Benini

Radio resource management (RRM) is critical in 5G mobile communications due to its ubiquity on every radio device and its low latency constraints. The rapidly evolving RRM algorithms with low latency requirements, combined with the dense and massive 5G base station deployment, call for an on-the-edge RRM acceleration system with a tradeoff between flexibility, efficiency, and cost, making application-specific instruction-set processors (ASIPs) an optimal choice. In this work, we start from a simple baseline RISC-V core and introduce instruction extensions coupled with software optimizations to maximize the throughput of a selected set of recently proposed RRM algorithms based on models using multilayer perceptrons (MLPs) and recurrent neural networks (RNNs). Furthermore, we scale from a single-ASIP to a multi-ASIP acceleration system to further improve RRM throughput. For the single-ASIP system, we demonstrate an energy efficiency of 218 GMAC/s/W and a throughput of 566 MMAC/s, corresponding to improvements of $10\times$ and $10.6\times$, respectively, over the single-core system with a baseline RV32IMC core. For the multi-ASIP system, we analyze how the parallel speedup depends on the input and output feature map (FM) size for fully connected and LSTM layers, achieving up to $10.2\times$ speedup with 16 cores over a single extended RI5CY core for single LSTM layers and a speedup of $13.8\times$ for a single fully connected layer. On the full RRM benchmark suite, we achieve an average overall speedup of $16.4\times$, $25.2\times$, $31.9\times$, and $38.8\times$ on two, four, eight, and 16 cores, respectively, compared to our single-core RV32IMC baseline implementation.
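
The speedup figures quoted above can be put in perspective with a little arithmetic. The Python sketch below is not from the paper: it takes only the numbers stated in the abstract and derives (a) the parallel efficiency of the 16-core layer results, defined here as speedup divided by core count, and (b) the gain from each doubling of the core count on the full benchmark suite. The function names `parallel_efficiency` and `doubling_gains` are hypothetical helpers introduced for illustration, and because the abstract reports rounded values, the derived numbers are approximate.

```python
# Figures quoted in the abstract (rounded as published).
CORES = 16
# Single-layer speedups on 16 cores, relative to ONE extended RI5CY core.
layer_speedup = {"LSTM": 10.2, "fully connected": 13.8}
# Full-suite average speedups, relative to the single-core RV32IMC baseline
# (these include the gain from the instruction extensions, not only parallelism).
suite_speedup = {2: 16.4, 4: 25.2, 8: 31.9, 16: 38.8}


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return speedup / cores


def doubling_gains(speedups: dict) -> dict:
    """Ratio of speedups between consecutive core counts, keyed by (from, to)."""
    cores = sorted(speedups)
    return {(a, b): speedups[b] / speedups[a] for a, b in zip(cores, cores[1:])}


if __name__ == "__main__":
    for name, s in layer_speedup.items():
        eff = parallel_efficiency(s, CORES)
        print(f"{name}: {eff:.1%} of ideal scaling on {CORES} cores")
    for (a, b), gain in doubling_gains(suite_speedup).items():
        print(f"{a} -> {b} cores: {gain:.2f}x additional speedup")
```

Under these definitions, the per-doubling gain on the full suite shrinks as cores are added, which is consistent with the abstract's remark that the parallel speedup depends on the feature map sizes of the layers.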

Updated: 2021-08-31