MEMTI: optimizing on-chip non-volatile storage for visual multi-task inference at the edge
IEEE Micro (IF 3.6) | Pub Date: 2019-11-01 | DOI: 10.1109/mm.2019.2944782
Marco Donato, Lillian Pentecost, David Brooks, Gu-Yeon Wei
The combination of specialized hardware and embedded nonvolatile memories (eNVM) holds promise for energy-efficient deep neural network (DNN) inference at the edge. However, integrating DNN hardware accelerators with eNVMs still presents several challenges. Multilevel programming is desirable for achieving maximal storage density on chip, but the stochastic nature of eNVM writes makes them prone to errors and further increases the write energy and latency. In this article, we present MEMTI, a memory architecture that leverages a multitask learning technique for maximal reuse of DNN parameters across multiple visual tasks. We show that by retraining and updating only 10% of all DNN parameters, we can achieve efficient model adaptation across a variety of visual inference tasks. The system performance is evaluated by integrating the memory with the open-source NVIDIA deep learning architecture.
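The abstract's key idea, reusing most DNN parameters across visual tasks and retraining only about 10% of them per task, can be illustrated with a minimal sketch. This is a hypothetical toy example, not the authors' implementation: the parameter list, the 10% task-specific index set, and the update rule are all illustrative assumptions.

```python
# Toy sketch (illustrative, not MEMTI's actual code): adapt a model to a
# new task by updating only a small task-specific fraction of parameters,
# leaving the shared ~90% frozen for reuse across tasks.

def adapt_for_task(params, task_specific_idx, grads, lr=0.01):
    """Apply a gradient step only to task-specific parameters.

    Shared parameters (indices not in task_specific_idx) are left
    untouched, so they can be reused by every task.
    """
    return [
        p - lr * g if i in task_specific_idx else p
        for i, (p, g) in enumerate(zip(params, grads))
    ]

# Toy model: 10 parameters, of which 1 (10%) is task-specific.
params = [1.0] * 10
task_specific_idx = {9}        # hypothetical task-specific slice
grads = [0.5] * 10             # hypothetical gradients for the new task

new_params = adapt_for_task(params, task_specific_idx, grads)
# Only index 9 changes; the nine shared parameters are unchanged.
```

In a real framework this corresponds to freezing the shared weights (e.g., disabling their gradients) and optimizing only the small task-specific subset, which also limits how many eNVM cells must be rewritten when switching tasks.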
Updated: 2019-11-01