VPU-EM: An Event-based Modeling Framework to Evaluate NPU Performance and Power Efficiency at Scale
arXiv - CS - Hardware Architecture Pub Date : 2023-03-17 , DOI: arxiv-2303.10271
Charles Qi, Yi Wang, Hui Wang, Yang Lu, Shiva Shankar Subramanian, Finola Cahill, Conall Tuohy, Victor Li, Xu Qian, Darren Crews, Ling Wang, Shivaji Roy, Andrea Deidda, Martin Power, Niall Hanrahan, Rick Richmond, Umer Cheema, Arnab Raha, Alessandro Palla, Gary Baugh, Deepak Mathaikutty

State-of-the-art NPUs are typically architected as self-contained sub-systems with multiple heterogeneous hardware computing modules and a dataflow-driven programming model. The industry lacks well-established methodologies and tools to evaluate and compare the performance of NPUs with different architectures. We present an event-based performance modeling framework, VPU-EM, targeting scalable performance evaluation of modern NPUs across diverse AI workloads. The framework adopts a high-level, event-based system-simulation methodology that abstracts away design details for speed while still modeling hardware pipelining, concurrency, and interaction with software task scheduling. It is developed natively in Python and built to interface directly with AI frameworks such as TensorFlow, PyTorch, ONNX, and OpenVINO, linking various in-house NPU graph compilers to achieve optimized full-model performance. Furthermore, VPU-EM can model the power characteristics of an NPU in Power-EM mode to enable joint performance/power analysis. Using VPU-EM, we conduct performance/power analysis of models from representative neural network architectures. We demonstrate that even though this framework was developed for Intel VPU, an Intel in-house NPU IP technology, the methodology can be generalized to the analysis of modern NPUs.
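
To make the idea of event-based modeling concrete, the following is a minimal sketch of a discrete-event simulation with two pipelined hardware blocks. It is not VPU-EM's code or API: the names `Simulator`, `Module`, and `simulate`, the two-stage DMA-then-compute pipeline, and the fixed latencies are all hypothetical, chosen only to show how an event queue can capture pipelining and concurrency without cycle-level detail.

```python
import heapq
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass(order=True)
class Event:
    time: int                      # simulated time at which the event fires
    seq: int                       # tie-breaker so equal-time events keep insertion order
    action: Callable[[], None] = field(compare=False)


class Simulator:
    """Hypothetical event queue: time jumps from one event to the next."""

    def __init__(self) -> None:
        self.now = 0
        self._queue: list[Event] = []
        self._seq = 0

    def schedule(self, delay: int, action: Callable[[], None]) -> None:
        heapq.heappush(self._queue, Event(self.now + delay, self._seq, action))
        self._seq += 1

    def run(self) -> int:
        while self._queue:
            event = heapq.heappop(self._queue)
            self.now = event.time
            event.action()
        return self.now  # time of the last event


class Module:
    """Hypothetical hardware block: one task at a time, fixed latency per task."""

    def __init__(self, sim: Simulator, latency: int, downstream: "Module | None" = None):
        self.sim = sim
        self.latency = latency
        self.downstream = downstream
        self.pending: deque[int] = deque()
        self.busy = False

    def submit(self, task: int) -> None:
        self.pending.append(task)
        self._try_start()

    def _try_start(self) -> None:
        if self.busy or not self.pending:
            return
        task = self.pending.popleft()
        self.busy = True
        self.sim.schedule(self.latency, lambda: self._finish(task))

    def _finish(self, task: int) -> None:
        self.busy = False
        if self.downstream is not None:
            self.downstream.submit(task)  # hand the finished task to the next stage
        self._try_start()                 # start the next queued task, if any


def simulate(num_tiles: int, dma_latency: int = 2, compute_latency: int = 3) -> int:
    """Total simulated time to push num_tiles through DMA followed by compute."""
    sim = Simulator()
    compute = Module(sim, compute_latency)
    dma = Module(sim, dma_latency, downstream=compute)
    for tile in range(num_tiles):
        dma.submit(tile)
    return sim.run()
```

Because the DMA block can fetch the next tile while the compute block is still busy, the simulated total for several tiles comes out lower than the sum of the per-tile latencies. That overlap is the kind of pipelining effect an event-based model is meant to preserve.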

Updated: 2023-03-21