Device-Circuit-Architecture Co-Exploration for Computing-in-Memory Neural Accelerators
IEEE Transactions on Computers ( IF 3.7 ) Pub Date : 2020-01-01 , DOI: 10.1109/tc.2020.2991575
Weiwen Jiang 1 , Qiuwen Lou 1 , Zheyu Yan 1 , Lei Yang 1 , Jingtong Hu 2 , Xiaobo Sharon Hu 1 , Yiyu Shi 1

Co-exploration of neural architectures and hardware design promises to optimize network accuracy and hardware efficiency simultaneously. However, state-of-the-art neural architecture search (NAS) algorithms for co-exploration target the conventional von Neumann computing architecture, whose performance is heavily limited by the well-known memory wall. In this paper, we are the first to bring the computing-in-memory architecture, which can readily overcome the memory wall, into neural architecture search, aiming to find neural architectures that combine high network accuracy with maximized hardware efficiency. This novel combination creates opportunities to boost performance but also raises several challenges. The design space spans multiple layers, from device type and circuit topology to neural architecture. In addition, performance may degrade in the presence of device variation. To address these challenges, we propose a cross-layer exploration framework, NACIM, which jointly explores the device, circuit, and architecture design space and takes device variation into account to find the most robust neural architectures. Experimental results demonstrate that NACIM can find a robust neural network with 0.45% accuracy loss in the presence of device variation, compared with a 76.44% loss for state-of-the-art NAS that does not consider variation; in addition, NACIM achieves an energy efficiency of up to 16.3 TOPS/W, 3.17× higher than state-of-the-art NAS.

Updated: 2020-01-01