Dataflow-Architecture Co-Design for 2.5D DNN Accelerators using Wireless Network-on-Package
arXiv - CS - Hardware Architecture Pub Date : 2020-11-30 , DOI: arxiv-2011.14755
Robert Guirado, Hyoukjun Kwon, Sergi Abadal, Eduard Alarcón, Tushar Krishna

Deep neural network (DNN) models continue to grow in size and complexity, demanding higher computational power to enable real-time inference. To efficiently deliver such computational demands, hardware accelerators are being developed and deployed across scales. This naturally requires an efficient scale-out mechanism for increasing compute density as required by the application. 2.5D integration over an interposer has emerged as a promising solution, but as we show in this work, the limited interposer bandwidth and the multiple hops in the Network-on-Package (NoP) can diminish the benefits of the approach. To cope with this challenge, we propose WIENNA, a wireless-NoP-based 2.5D DNN accelerator. In WIENNA, the wireless NoP connects an array of DNN accelerator chiplets to the global buffer chiplet, providing high-bandwidth multicasting capabilities. We also identify, for each layer, the dataflow style that most efficiently exploits the wireless NoP's high-bandwidth multicasting capability. With modest area and power overheads, WIENNA achieves 2.2X--5.1X higher throughput and 38.2% lower energy than an interposer-based NoP design.
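To see why multicast over a wireless NoP helps, consider a back-of-envelope comparison (not from the paper; all parameters are illustrative): on a mesh-style interposer NoP, delivering the same weight tile from the global-buffer chiplet to every accelerator chiplet costs one link traversal per hop per destination, while a wireless broadcast reaches the whole package at once.

```python
# Illustrative hop-count model (hypothetical, not the paper's evaluation):
# multicast one tile from a global-buffer chiplet at mesh corner (0, 0)
# to every chiplet in an n x n array, using simple XY unicast routing,
# versus a single wireless broadcast that covers the whole package.

def mesh_multicast_hops(n: int) -> int:
    """Total link traversals to unicast a tile from (0, 0)
    to every chiplet of an n x n mesh (XY routing)."""
    return sum(x + y for x in range(n) for y in range(n))

def wireless_multicast_hops(n: int) -> int:
    """One wireless broadcast reaches all n x n chiplets at once."""
    return 1

if __name__ == "__main__":
    for n in (2, 4, 8):
        print(f"{n}x{n} chiplets: mesh = {mesh_multicast_hops(n)} "
              f"link traversals, wireless = {wireless_multicast_hops(n)}")
```

The gap widens quadratically with the array size, which is why multicast-heavy dataflows are the ones that benefit most from the wireless NoP's broadcast capability.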

Updated: 2020-12-01