Improving Inference Lifetime of Neuromorphic Systems via Intelligent Synapse Mapping
arXiv - CS - Hardware Architecture Pub Date : 2021-06-16 , DOI: arxiv-2106.09104
Shihao Song, Twisha Titirsha, Anup Das

Non-Volatile Memories (NVMs) such as Resistive RAM (RRAM) are used in neuromorphic systems to implement high-density and low-power analog synaptic weights. Unfortunately, an RRAM cell can switch its state after reading its content a certain number of times. Such behavior challenges the integrity and program-once-read-many-times philosophy of implementing machine learning inference on neuromorphic systems, impacting the Quality-of-Service (QoS). Elevated temperatures and frequent usage can significantly shorten the number of times an RRAM cell can be reliably read before it becomes absolutely necessary to reprogram. We propose an architectural solution to extend the read endurance of RRAM-based neuromorphic systems. We make two key contributions. First, we formulate the read endurance of an RRAM cell as a function of the programmed synaptic weight and its activation within a machine learning workload. Second, we propose an intelligent workload mapping strategy incorporating the endurance formulation to place the synapses of a machine learning model onto the RRAM cells of the hardware. The objective is to extend the inference lifetime, defined as the number of times the model can be used to generate output (inference) before the trained weights need to be reprogrammed on the RRAM cells of the system. We evaluate our architectural solution with machine learning workloads on a cycle-accurate simulator of an RRAM-based neuromorphic system. Our results demonstrate a significant increase in inference lifetime with only a minimal performance impact.
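
To make the idea of endurance-aware mapping concrete, here is a minimal Python sketch. It is not the authors' formulation or algorithm: the abstract does not give the endurance model, so the sketch assumes each RRAM cell simply has a fixed read budget (`cell_budgets`, hypothetical numbers) and each synapse is read a fixed number of times per inference (`synapse_reads`, also hypothetical). Under those assumptions, inference lifetime is set by the first cell to exhaust its budget, and a simple greedy placement (busiest synapse onto the most durable cell) can be compared against a naive in-order placement.

```python
def inference_lifetime(reads_per_inference, read_budget):
    """Inferences possible before the first cell exhausts its read budget."""
    return min(
        (budget // reads
         for reads, budget in zip(reads_per_inference, read_budget)
         if reads > 0),
        default=float("inf"),  # no cell is ever read
    )


def map_synapses(synapse_reads, cell_budgets):
    """Greedy placement: the most frequently read synapse gets the most durable cell.

    Returns a list where placement[s] is the cell index assigned to synapse s.
    This is an illustrative heuristic, not the mapping strategy from the paper.
    """
    if len(synapse_reads) > len(cell_budgets):
        raise ValueError("not enough cells for all synapses")
    synapses = sorted(range(len(synapse_reads)),
                      key=lambda s: synapse_reads[s], reverse=True)
    cells = sorted(range(len(cell_budgets)),
                   key=lambda c: cell_budgets[c], reverse=True)
    placement = [None] * len(synapse_reads)
    for s, c in zip(synapses, cells):
        placement[s] = c
    return placement


def lifetime_of(placement, synapse_reads, cell_budgets):
    """Inference lifetime of a given synapse-to-cell placement."""
    budgets = [cell_budgets[c] for c in placement]
    return inference_lifetime(synapse_reads, budgets)


# Hypothetical workload: reads per inference for four synapses,
# and assumed read budgets for four cells.
synapse_reads = [8, 1, 4, 2]
cell_budgets = [1000, 8000, 2000, 4000]

naive = [0, 1, 2, 3]                      # synapse i placed on cell i
aware = map_synapses(synapse_reads, cell_budgets)

naive_lifetime = lifetime_of(naive, synapse_reads, cell_budgets)
aware_lifetime = lifetime_of(aware, synapse_reads, cell_budgets)
print(naive_lifetime, aware_lifetime)
```

In this toy model the naive placement is limited by the heavily read synapse sitting on the weakest cell, while the greedy placement balances reads against budgets. A real formulation would, as the abstract states, also make a cell's endurance depend on the programmed weight and its activation, which this sketch deliberately leaves out.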

Updated: 2021-06-18