SMAUG
ACM Transactions on Architecture and Code Optimization (IF 1.5), Pub Date: 2020-11-10, DOI: 10.1145/3424669
Sam (Likun) Xi 1 , Yuan Yao 1 , Kshitij Bhardwaj 1 , Paul Whatmough 2 , Gu-Yeon Wei 1 , David Brooks 1

In recent years, there have been tremendous advances in hardware acceleration of deep neural networks. However, most of the research has focused on optimizing accelerator microarchitecture for higher performance and energy efficiency on a per-layer basis. We find that for overall single-batch inference latency, the accelerator may only make up 25–40%, with the rest spent on data movement and in the deep learning software framework. Thus far, it has been very difficult to study end-to-end DNN performance during early-stage design (before RTL is available), because no existing DNN framework supports end-to-end simulation with easy custom hardware accelerator integration. To address this gap in research infrastructure, we present SMAUG, the first DNN framework that is purpose-built for simulation of end-to-end deep learning applications. SMAUG offers researchers a wide range of capabilities for evaluating DNN workloads, from diverse network topologies to easy accelerator modeling and SoC integration. To demonstrate the power and value of SMAUG, we present case studies that show how we can optimize overall performance and energy efficiency for up to 1.8×–5× speedup over a baseline system, without changing any part of the accelerator microarchitecture, and that show how SMAUG can tune an SoC for a camera-powered deep learning pipeline.

Updated: 2020-11-10